What is NLP?
If you walk to an intersection of computational linguistics, artificial intelligence, and computer science, you are more than likely to see Natural Language Processing (NLP) there as well. This is what Wikipedia says. NLP involves computers processing natural language—human-generated language and not math or Java or C++.
You’ve already seen these famous examples of NLP, I’m sure: Apple’s Siri using speech recognition and generation, IBM Watson for question answering, and Google Translate for machine translation. For instance, if you go to Google and end up on a page in Portuguese, it asks you if you want to translate it. This is NLP. It analyzes text, yes. But dealing with the “uncertainty” of human language is not easy, so NLP has to extract meaning from the text.
Do you recall HAL from Stanley Kubrick’s film 2001: A Space Odyssey? Information retrieval, information extraction, and inference: HAL could do all of these tasks. Playing chess, displaying graphics, and carrying on natural conversations with humans were fascinating when this film was released in 1968. And now? NLP manifests itself in Microsoft Cortana, Palantir, Summly, and Facebook Graph Search. Wow.
NLP includes Natural Language Generation (NLG) and Natural Language Understanding (NLU). When your computer can write the way you, a human, can, with personality, variety, and emotion, that’s NLG. Understanding the meaning of written text and producing data that embodies this meaning is NLU; you need to manage ambiguities here.
What makes up NLP?
Let’s look at the major components of NLP.
Entity extraction involves segmenting a sentence to identify and extract entities, such as a person (real or fictional), organization, location, event, etc. NLP APIs use online data from sources like Wikipedia or other repositories to match these entities. One of the main challenges is to match the different variations of an entity and cluster them as the same entity.
For example, assume that there is an entity called Howard Roark. In a given article, the variations for this entity could include Roark, Mr. Roark, Howard Roark, and so on. The algorithm should be able to identify and cluster all these variations.
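Here is a minimal sketch of that clustering idea. It assumes a deliberately naive rule: drop honorifics, then group mentions by their last word. The function names, the title list, and the second character name (added only for contrast) are all hypothetical, and real entity linkers rely on much richer signals.

```python
from collections import defaultdict

# Hypothetical list of honorifics to ignore when comparing mentions.
TITLES = {"mr", "mrs", "ms", "dr"}


def normalize(mention):
    """Lowercase a mention, drop trailing periods, and remove honorifics."""
    tokens = [t.strip(".").lower() for t in mention.split()]
    return [t for t in tokens if t and t not in TITLES]


def cluster_mentions(mentions):
    """Group mentions that share the same final token (assumed to be a surname)."""
    clusters = defaultdict(list)
    for mention in mentions:
        tokens = normalize(mention)
        if tokens:
            clusters[tokens[-1]].append(mention)
    return dict(clusters)


# "Dominique Francon" is included only to show a second cluster forming.
mentions = ["Howard Roark", "Mr. Roark", "Roark", "Dominique Francon"]
clusters = cluster_mentions(mentions)
print(clusters)
```

A last-word rule like this breaks as soon as two people share a surname, which is part of why this matching step is called a challenge.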
Entity extraction has two important components:
- Entity type: Person, place, organization, etc.
- Salience: Importance or the centrality of an entity on the scale of 0 to 1 (these scores indicate the relevance of the entity to the entire text, with scores closer to 1 being more important than those closer to 0)
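To get a feel for a 0-to-1 salience scale, here is a toy sketch that scores each entity by its share of all entity mentions in a text. This is only an illustrative assumption: production NLP APIs compute salience with their own models, and the function name and mention list below are hypothetical.

```python
from collections import Counter


def toy_salience(entity_mentions):
    """Return each entity's share of all entity mentions, a value between 0 and 1."""
    counts = Counter(entity_mentions)
    total = sum(counts.values())
    return {entity: count / total for entity, count in counts.items()}


# Hypothetical mention list, one item per occurrence of an entity in a text.
mentions = ["Karna", "Duryodhana", "Karna", "Hastinapur", "Karna", "Duryodhana"]
scores = toy_salience(mentions)
for entity, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{entity}: {score:.2f}")
```

Frequency alone ignores where and how an entity appears, so treat this as a rough stand-in for "centrality" and nothing more.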
This is a snippet of the entire analysis.
Note that “Karna,” with the entity type “person,” has been analyzed with a salience of 0.5 out of 1. This takes into consideration the number of occurrences across the text and the entity’s overall centrality in the context of the entire text.
The entities of type person, Karna and Duryodhana, are in red. In the same text, “world/Hastinapur” will be location, “exile/allegiance” will be other, “army” will be organization, “military campaign” will be event, and so on. You get the idea, right?
Syntax refers to the proper ordering of words. Do the words you’ve put together form a “correct” sentence? Syntax deals with the structural roles of words in the sentence. You then use a parsing algorithm to produce a “tree,” which gives you the syntactic relationships between the constituents according to a context-free grammar. (This is a good video. Warning! Quality not so great.) In sentence extraction, text is broken up into sentences. In tokenization, the text is broken up into tokens (like words or punctuation in natural language), to which a natural language API adds syntactic information. The tokens are put in the dependency tree you see below. This analysis includes part-of-speech tagging, chunking, and sentence assembling.
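Sentence extraction and tokenization can be sketched with two regular expressions. This is a simplified illustration under the assumption that sentences end in ".", "!" or "?" followed by whitespace; the function names and the sample text are hypothetical, and real tokenizers also handle abbreviations, quotes, and much more.

```python
import re


def split_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace (naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Return words and punctuation marks as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


# Hypothetical sample text for illustration.
text = "Karna embarks upon a military campaign. He conquers many kings!"
sentences = split_sentences(text)
tokens = [tokenize(s) for s in sentences]
for sentence_tokens in tokens:
    print(sentence_tokens)
```

Each token list is what a tagger or parser would then annotate with syntactic information.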
We’ll take a part of the paragraph from the Mahabharata we used in the example above. You see that the words are tagged with their parts of speech based on the general grammar rules of the language. (In the example, the root is the main verb in the sentence; “Karna” has a nominal subject relationship to “embarks,” and “upon” has a prepositional one.)
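One way to picture a dependency tree in code is as a list of tokens, each pointing at its head. The sketch below hand-writes the three relations mentioned above. The Token class, the helper names, and the label strings ("nsubj", "root", "prep") are assumptions that follow common conventions, not the output of any particular API.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    head: int      # index of the head token; the root points to itself
    relation: str  # dependency label


# Hand-written parse of a sentence fragment (not produced by a parser).
tokens = [
    Token("Karna", 1, "nsubj"),
    Token("embarks", 1, "root"),
    Token("upon", 1, "prep"),
]


def find_root(tokens):
    """Return the token labeled as the root of the sentence."""
    return next(t for t in tokens if t.relation == "root")


def dependents(tokens, head_index):
    """Return tokens attached to the given head, excluding the head itself."""
    return [t for i, t in enumerate(tokens) if t.head == head_index and i != head_index]


root = find_root(tokens)
children = [(t.text, t.relation) for t in dependents(tokens, 1)]
print(root.text, children)
```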
After a sentence is parsed to extract entities and understand the syntax, semantic analysis derives the meaning of the sentence in a context-free form, as an independent sentence. The inferred meaning may not match the meaning that was actually intended.
In the sentence “Karna had a crossbow,” the computer infers that “had” means “owned.” Therefore, “Karna had an apple” may be perceived as “Karna owned an apple” and not “Karna ate an apple.” The computer may be “confused” because of the grammar rules. It requires a certain knowledge of the world to “understand” the real meaning of the sentence.
As you can see in the Syntactic analysis image below, the words are interconnected. For example, the computer identifies the root verb and connects it to the noun “qualities,” and so on. This part deals with lexical semantics, which determines the connections between words to derive the meaning of the sentence. Lexical semantics refers to the meaning of component words. It includes word sense disambiguation; for example, “country” can refer to a nation you belong to or your favorite genre of music. “How words combine to form larger meanings” is compositional semantics; it is typically expressed as logic.
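Word sense disambiguation can be illustrated with a simplified, Lesk-style overlap count: pick the sense whose dictionary gloss shares the most words with the sentence. The tiny sense inventory, the stopword list, and the function names below are hypothetical stand-ins for a real lexical resource.

```python
# Hypothetical mini sense inventory: each sense of a word has a short gloss.
SENSES = {
    "country": {
        "nation": "a nation with its own government borders and citizens",
        "music": "a genre of popular music with guitar songs and singers",
    }
}

STOPWORDS = {"a", "an", "the", "of", "with", "and", "its", "own", "to", "in", "is", "my"}


def content_words(text):
    """Return the lowercase words of a text, minus stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def disambiguate(word, sentence):
    """Pick the sense whose gloss shares the most content words with the sentence."""
    context = content_words(sentence)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("country", "she plays country songs on her guitar"))
print(disambiguate("country", "the government of the country closed its borders"))
```

When no gloss overlaps with the sentence, this sketch simply falls back to the first listed sense, which is one reason real systems need broader context.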
So relevance will be the core of semantic analysis then.
Once the syntactic and semantic analysis has been completed, we try to understand the sentiment behind every sentence. Sentiment includes emotions, opinions, and attitudes. We are talking about subjective impressions and not facts. This is also referred to as opinion mining (a powerful tool in social media). For example, to determine whether a review is positive or negative, you use the magnitude (the extent of emotional content in the text) and the score (the overall emotion of the text). Polarity values run from −1 for the most negative content to +1 for the most positive content. A document with a score of 0.3 and a magnitude of 3.8 will be slightly positive with an appreciable level of emotion. If your document is long, the magnitude value is likely to be high.
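The difference between score and magnitude is easier to see with a toy lexicon-based sketch. It assumes the score is the mean of per-sentence polarities and the magnitude is the sum of their absolute values. That is one plausible reading of the definitions above, not how any specific API computes them, and the lexicon and function names are hypothetical.

```python
import re

# Hypothetical word lexicon with polarity values in [-1, 1].
LEXICON = {"great": 0.8, "love": 0.9, "good": 0.5, "slow": -0.4, "terrible": -0.9}


def sentence_score(sentence):
    """Average polarity of the lexicon words in a sentence (0.0 if none)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def document_sentiment(text):
    """Return (score, magnitude): mean sentence score and sum of absolute sentence scores."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    scores = [sentence_score(s) for s in sentences]
    score = sum(scores) / len(scores) if scores else 0.0
    magnitude = sum(abs(s) for s in scores)
    return score, magnitude


# Hypothetical mixed review: strong praise and strong criticism.
review = "I love the camera. The battery is terrible. Shipping was good."
score, magnitude = document_sentiment(review)
print(f"score={score:.2f} magnitude={magnitude:.2f}")
```

With a mixed review like this one, the score lands close to zero while the magnitude stays high, which is the signal that the text is emotional rather than neutral. Because the magnitude is a sum, it also grows as the document gets longer.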
Once again, we will use our Karna example.
Note that your polarity range is your score range.
This means that you input all the versions of Mahabharata written by different authors, segregate the characters, and average out the overall sentiment to analyze how Karna as a character is widely perceived.
Practical applications in brand-monitoring strategies are aplenty. The graph you see below is a good example. It shows the sentiment of media articles about company X after the launch of product Y, using data from all the press releases. −1 and 1 represent the extreme negative and positive sentiments, respectively. The average sentiment has been calculated as a product of polarity and magnitude.
If you go to your editor and ask her to suggest a better sentence structure for a line, her immediate question to you will be, “What’s the context?” Most of the time, because natural language is so flexible, complexities arise in interpreting the meaning of an isolated statement. Pragmatic analysis uses the context of utterance: when, why, by whom, where, and to whom something was said. It deals with intentions like criticize, inform, promise, request, and so on. For example, if I say “You are late,” is it information or criticism? In discourse integration, the aim is to analyze the statement in relation to the preceding or succeeding statements, or even the overall paragraph, in order to understand its meaning. Take this one: Chloe wanted it. (What “it” refers to depends on the surrounding statements.) Unlike semantic analysis, pragmatic analysis interprets meaning in terms of the context of use.
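As a rough sketch of that last calculation, each article’s polarity can be multiplied by its magnitude and the products averaged. The numbers below are made-up placeholders, not data from the graph, and the function name is hypothetical.

```python
from statistics import mean

# Hypothetical (polarity, magnitude) pairs for three articles; not real data.
articles = [(0.6, 2.0), (-0.3, 1.5), (0.1, 4.0)]


def weighted_sentiment(polarity, magnitude):
    """Combine polarity and magnitude into one value by multiplying them."""
    return polarity * magnitude


per_article = [weighted_sentiment(p, m) for p, m in articles]
average = mean(per_article)
print(per_article, average)
```

Note that a raw product is not confined to the −1 to 1 range, so a chart on that scale would presumably need the magnitudes normalized first; that step is an assumption and is left out here.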
But the perspectives NLP and linguistics have about pragmatics are essentially different.
A few applications of NLP
What I can gather from my research is that spam detection, part-of-speech tagging, and named entity recognition are mostly solved. Information extraction, machine translation, word sense disambiguation, parsing, co-reference resolution, and sentiment analysis are making good progress. Question answering, dialog, summarization, and paraphrase are still rather hard. Take a look at this video to understand these terms better.
You’ve already seen NLP in practice:
- Companies using AI chatbots that give you suggestions to locate the nearest grocery store, book a movie ticket, order food, etc.
- Sentiment analysis during a political campaign to make informed decisions by monitoring trending issues on social media
- Analyzing lengthy text reviews by users of products on an e-commerce website
- Call centers using NLP to analyze the general feedback of the callers
Different APIs, customized with different parameters, are available depending on the problem that you are trying to solve. Advanced NLP algorithms use statistical machine learning along with deep analytics, which enable us to deal efficiently with unstructured data. The flexibility of the natural language that humans use can be challenging for computers to interpret with regular grammar rules and semantics. Despite this, we have made significant progress in NLP.
I think Alan Turing would be proud of the fantastic leaps we’ve already taken since 1950, what say?
Also published on Medium.