BANA 2082 Chapter Notes - Chapter 5: Unstructured Data, Data Cleansing, Data Pre-Processing
Document Summary
Almost 500 million tweets are posted on the Twitter social network every day. Many of these tweets provide valuable insight into how Twitter users regard a company's goods and services. Some tweets praise a product; others complain about poor service quality. In addition, the number of followers varies significantly from user to user (some have thousands of followers and others only a few), so users have different levels of influence. Tech-savvy businesses can use social media data to improve their goods and services. Online reviews on sites such as Amazon and Yelp also provide insight into how consumers experience goods and services. In these cases, however, the data are not numerical. The data are text: phrases, sentences, and paragraphs. Like numerical data, text can contain information that helps solve problems and leads to better decisions. The process of extracting useful information from text data is called text mining.
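As a rough illustration of what extracting information from text can look like, the sketch below turns a few short posts into term counts. The posts, the stop-word list, and the function names `tokenize` and `term_frequencies` are all invented for this example; this is one simple counting approach under those assumptions, not the specific method used in the chapter.

```python
import re
from collections import Counter

# Hypothetical example posts, made up for illustration (not real data).
posts = [
    "Love the new phone, great battery!",
    "Terrible service, the phone arrived late.",
    "Great phone, great price.",
]

# A tiny illustrative stop-word list (an assumption; real lists are longer).
STOP_WORDS = {"the", "a", "an", "and"}


def tokenize(text):
    """Lowercase the text and return its word tokens, minus stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequencies(documents):
    """Count how often each term appears across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


freq = term_frequencies(posts)
print(freq["phone"], freq["great"], freq["terrible"])
```

Even this small count turns free text into numbers: "phone" and "great" each appear three times, while "terrible" appears once, which hints at what the posts are about and how they lean.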