In the field of natural language processing (NLP), data is king: the more data you have, the better your results tend to be. Most new research is freely accessible these days and, thanks to the cloud, virtually unlimited computing power is at our disposal. What keeps an NLP researcher from achieving state-of-the-art results despite this is the lack […] The post Extracting Data from Common Crawl Dataset appeared first on QBurst Blog.