Using Python (Stackless) + MongoDB to Analyze Apache Logs (2 GB)
Why choose Stackless? Stackless can be thought of simply as an enhanced version of Python, and its most eye-catching feature is none other than "micro-threads". Micro-threads are lightweight threads. Compared with ordinary threads, switching between them consumes fewer resources, sharing data between them is more convenient, and the resulting code is more concise and readable than multi-threaded code. The project is backed by EVE Online, and it is really strong in terms of concurrency and performance. Installation is the same as for standard Python, and you can consider replacing the original system Python with it. 🙂

Why choose MongoDB? The official website lists many popular applications that use MongoDB, such as SourceForge and GitHub. What are its advantages over an RDBMS? First of all, its most obvious advantages are speed and performance. It can be used not only as a key-value database, but also supports some database-style queries (distinct, group, random, index, etc.). Another feature is simplicity: whether it is the application itself, the documentation, or the third-party APIs, you can pick it up almost at a glance. It is a pity, however, that the stored data files are very large, 2 to 4 times the size of the original data. The Apache log tested in this article is 2 GB, and the produced data…