This blog post Data sets are the new server rooms makes the point that a bunch of companies raise a ton of money to go get really proprietary awesome data as a competitive moat. Because once you have the data, you can build a better product, and no one can copy it (at least not very cheaply). Ideally you hit a virtuous cycle as well, where usage of your system once it takes of gives even more data, which makes the system even better, which attracts more users…