We believe the soul of BigCode is clear and transparent communication in the service of open collaboration. The project therefore runs under the following set of open and permissive licenses. Datasets: We value openness and transparency about the training data of LLMs and intend to release datasets whenever we have the rights to do so. We will also provide data cards for all datasets we release. Please see the Dataset Card for The Stack.| BigCode
We are excited to invite AI practitioners from diverse backgrounds to join the BigCode project! Note that BigCode is a research collaboration, open to participants who have a professional research background and are able to commit time to the project. In general, we expect applicants to be affiliated with a research organization (in academia or industry) and to work on the technical, ethical, or legal aspects of LLMs for coding applications.| BigCode
SPDX License List| spdx.org