This repository holds the tools and experiments for analyzing Docker Hub at the image filesystem level. It contains the following:
- dockerhub_crawler, a Python tool that crawls Docker Hub to retrieve data about images
- minimal_libs, a tiny tool to predict the native packages required by a given Python package, based on the data from dockerhub_crawler
- research, notebooks of data exploration and applied machine learning experiments on the data from dockerhub_crawler