What is Text Analysis?
Definition of Text Analysis:
Text analysis is the process of sorting and analyzing data contained in text for research purposes.
Text mining entails cleaning, marking up, organizing, and parsing the content of a corpus.
By using digital text analysis tools, we can easily search and examine word frequencies, patterns, and relationships.
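As a minimal sketch of what "examining word frequencies" can look like in practice, the Python snippet below counts how often each word appears in a short passage. The sample sentence, the function name word_frequencies, and the simple letters-and-apostrophes tokenizing rule are all assumptions made for illustration; real text analysis tools typically apply more careful tokenization and may remove common "stop words" first.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    # Illustrative tokenizer: a "word" is any run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Hypothetical sample text standing in for a document from a corpus.
sample = "The cat sat on the mat. The mat was warm."
freqs = word_frequencies(sample)

# Show the two most frequent words with their counts.
print(freqs.most_common(2))  # [('the', 3), ('mat', 2)]
```

The same function could be applied to each document in a corpus to compare vocabulary across texts.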
Commonly Used Terms:
APIs (Application Programming Interfaces): Interfaces written by the owners of the content to provide a clean, machine-readable version of that content. Many databases or websites with large amounts of data will make their APIs available so that people can reuse the data. For instance, Twitter has a public API.
Corpus/Corpora: A corpus (plural: corpora) is a collection of documents or texts, e.g. web pages or journal articles.
Crawling: A method used to automatically find links within a website, follow those links, and scrape information from the pages they lead to.
Parsing: Refers to the process of (syntactic) analysis of text, i.e. identifying how a sentence follows the grammatical rules of a language. It breaks down a unit/sentence into its component parts. You can also parse files into their component parts.
Scraping: Automatically extracting information from a website. It is similar to manually going to a website, highlighting and copying the information, and pasting it somewhere else.
Text (and data) mining: Text mining is the data analysis of natural language works, such as articles and books, using text as a form of data. It is often joined with data mining, the numeric analysis of data works, like filings and reports, and referred to as "text and data mining" or, simply, "TDM."
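To make the terms parsing, scraping, and crawling more concrete, here is a small Python sketch that uses only the standard library. It parses a hypothetical HTML page (held in a string so no network access is needed), scrapes the visible text, and collects the links a crawler would visit next. The class name LinkAndTextScraper, the sample page, and its URLs are invented for illustration; a real project would also need to fetch pages and respect each site's terms of use.

```python
from html.parser import HTMLParser


class LinkAndTextScraper(HTMLParser):
    """Parse HTML, collecting link targets and visible text."""

    def __init__(self):
        super().__init__()
        self.links = []       # href values found in <a> tags
        self.text_parts = []  # non-empty pieces of visible text

    def handle_starttag(self, tag, attrs):
        # Finding links: a crawler would follow these to reach more pages.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Scraping: keep the text between tags, skipping pure whitespace.
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


# Hypothetical page content; the paths below are made up for this example.
page = """
<html><body>
<h1>Sample Corpus</h1>
<p>Read the <a href="/docs/one.html">first document</a> and the
<a href="/docs/two.html">second document</a>.</p>
</body></html>
"""

scraper = LinkAndTextScraper()
scraper.feed(page)

print(scraper.links)       # ['/docs/one.html', '/docs/two.html']
print(scraper.text_parts)  # the scraped text, piece by piece
```

Repeating this process on each collected link is, in outline, what a crawler does.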
List of Free Digitized Texts:
List of Text Analysis (TA) & Data Visualization (DV) Tools:
Examples of How Text Analysis is Used: