Our powerful tools can ingest all of the world’s public Git repositories and turn code into abstract syntax trees (ASTs) ready for machine learning and other analyses, all exposed through a flexible and friendly API. We are paving the way for the future of the software development life cycle, where code becomes analyzable data that powers the next generation of developer tools.

Source{d} is building the tech stack for machine learning on source code (MLoSC). Through the source{d} engine and machine learning tools, code becomes a first-class analyzable asset, whether in a single repository or across tens of millions. With access to every open source and public Git repository online today, developers and organizations can understand their code as part of the complex set of dependencies it really is.

We envision every organization running a data pipeline over its software development life cycle, where source code becomes a unique, actionable dataset that can be analyzed and used in machine learning models. These are the building blocks for the next generation of developer tools and systems, which will change not only the way we learn programming but also how we write and review code.



First-class analyzable code across tens of millions of repositories
