From e1eeec3e2fba449ea283ef66da6f119bb42273b0 Mon Sep 17 00:00:00 2001
From: Arnaud Porterie
Date: Tue, 23 Dec 2014 17:48:14 -0800
Subject: [PATCH] Update README.md and documentation

Signed-off-by: Arnaud Porterie
---
 README.md                 |  55 ++++++++++++-
 doc/opensprint/kickoff.md | 158 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 211 insertions(+), 2 deletions(-)
 create mode 100644 doc/opensprint/kickoff.md

diff --git a/README.md b/README.md
index bfae9ef57..b1e607a8d 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,55 @@
-distribution
+Distribution
 ============
 
-The Docker toolset to pack, ship, store, and deliver content
+The Docker toolset to pack, ship, store, and deliver content.
+
+Planned content for this repository:
+
+* Distribution related specifications
+    - Image format
+    - JSON registry API
+* Registry implementation: a Golang implementation of the JSON API
+* Client libraries to consume conforming implementations of the JSON API
+
+# Ongoing open sprint
+
+### What is an open sprint?
+
+The open sprint is a focused effort of a small group of people to kick off a new project, while committing to becoming maintainers of the resulting work.
+
+**Having a dedicated team work on the subject doesn't mean that you, the community, cannot contribute!** We need your input to make the best use of the sprint, and focus our work on what matters to you. For this particular topic:
+
+* Come discuss on IRC: #docker-distribution on FreeNode
+* Submit your ideas, and upvote those you think matter the most on [Google Moderator](https://www.google.com/moderator/?authuser=1#16/e=2165c3)
+
+### Goal of the distribution sprint
+
+Design a professional grade and extensible content distribution system, that allows users to:
+
+* Enjoy an efficient, secured and reliable way to store, manage, package and exchange content
+* Hack/roll their own on top of healthy open-source components
+* Implement their own home-made solution through good specs and a solid extension mechanism.
+
+### Schedule and expected output
+
+The Open Sprint will start on **Monday December 29th**, and end on **Friday January 16th**.
+
+What we want to achieve as a result is:
+
+* Tactical fixes of today's frustrations in the existing Docker codebase
+    - This includes a thorough review of [docker/docker#9784](https://github.com/docker/docker/pull/9784) by core maintainers
+
+* Laying the base of a new distribution subsystem, living independently, and with a well defined group of maintainers. This is the purpose of this repository, which aims at hosting:
+    - A specification of the v2 image format
+    - A specification of the JSON/HTTP protocol
+    - Server-side Go implementation of the v2 registry
+    - Client-side Go packages to consume this new API
+    - Standalone binaries providing content distribution functionalities outside of Docker
+
+### How will this integrate with Docker engine?
+
+Building awesome, independent, and well maintained distribution tools should give Docker core maintainers enough incentive to switch to the newly developed subsystem. We make no assumptions about a given date or milestone, both because the urgent problems should be fixed through [docker/docker#9784](https://github.com/docker/docker/pull/9784) and in order to maintain focus on producing a top quality alternative.
+
+### Relevant documents
+
+* [Analysis of current state and goals](doc/opensprint/kickoff.md)
diff --git a/doc/opensprint/kickoff.md b/doc/opensprint/kickoff.md
new file mode 100644
index 000000000..d2b69e48c
--- /dev/null
+++ b/doc/opensprint/kickoff.md
@@ -0,0 +1,158 @@
+Distribution
+=========================
+
+## Project intentions
+
+**Problem statement and requirements**
+
+* What is the exact scope of the problem?
+
+
+Design a professional grade and extensible content distribution system, that allows docker users to:
+
+... by default enjoy:
+
+ * an efficient, secured and reliable way to store, manage, package and exchange content
+
+... 
optionally:
+
+ * hack/roll their own on top of healthy open-source components
+
+... with the liberty to:
+
+ * implement their own home-made solution through good specs and a solid extension mechanism
+
+
+* Who will the result be useful to?
+
+    * users
+    * ISV (who distribute images or develop image distribution solutions)
+    * docker
+
+* What are the use cases (distinguish dev & ops population where applicable)?
+
+    * Everyone (... uses docker push/pull).
+
+* Why does it matter that we build this now?
+
+    * Shortcomings of the existing codebase are the #1 pain point (by far) for users, partners and ISV, hence the most urgent thing to address (?)
+    * That situation is getting worse every day, and killer competitors are emerging or have already emerged.
+
+* Who are the competitors?
+
+    * existing artifact storage solutions (e.g. Artifactory).
+    * emerging products that aim at handling pull/push in place of docker.
+    * ISV that are looking for alternatives to work around this situation
+
+**Current state: what do we have today?**
+
+Problems of the existing system:
+
+1. not reliable
+    * registry goes down whenever the hub goes down
+    * a failing push results in broken repositories
+    * concurrent push is not handled
+    * python boto and gevent have a terrible history
+    * organically grown, under-designed features are in a bad shape (search)
+2. inconsistent
+    * discrepancies between duplicated API (and *duplicated APIs*)
+    * unused features
+    * missing essential features (proper SSL support)
+3. not reusable
+    * tight entanglement with the hub component makes it very difficult to use outside of docker
+    * proper access-control is almost impossible to do right
+    * not easily extensible
+4. 
not efficient
+    * no parallel operations (by design)
+    * sluggish client-side processing / bad pipeline design
+    * poor reusability of content (random ids)
+    * scalability issues (tags)
+    * too many useless requests (protocol)
+    * too much local space consumed (local garbage collection: broken + not efficient)
+    * no squashing
+5. not resilient to errors
+    * no resume
+    * error handling is obscure or nonexistent
+6. security
+    * content is not verified
+    * current tarsum is broken
+    * random ids are a headache
+7. confusing
+    * registry vs. registry.hub?
+    * layer vs. image?
+8. broken features
+    * mirroring is not done correctly (too complex, bug-laden, caching is hard)
+9. poor integration with the rest of the project
+    * technology discrepancy (python vs. go)
+    * poor testability
+    * poor separation (API in the engine is not defined enough)
+10. missing features / prevents future
+    * trust / image signing
+    * naming / transport separation
+    * discovery / layer federation
+    * architecture + os support (e.g. arm/windows)
+    * quotas
+    * alternative distribution methods (transport plugins)
+
+**Future state: where do we want to get?**
+
+* Deliverable
+    * new JSON/HTTP protocol specification
+    * new image format specification
+    * (new image store in the engine)
+    * new transport API between the engine and the distribution client code / new library
+    * new registry in go
+    * new authentication service on top of the trust graph in go
+
+* What are the interactions with other components of the project?
+    * critical interactions with docker push/pull mechanism
+    * critical interactions with the way docker stores images locally
+
+* In what way will the result be customizable?
+    * transport plugins allowing for radically different transport methods (BitTorrent, direct S3 access, etc.)
+    * extensibility design for the registry allowing for complex integrations with other systems
+    * backend storage drivers API
+
+
+## Kick-off output
+
+**What is the expected output of the kick-off session?**
+
+* draft specifications
+* separate binary tool for demo purposes
+* a mergeable PR that fixes 90% of the listed issues
+
+
+* agree on a vision that allows solving all the issues that are deemed worthy
+* propose a long term battle plan with clear milestones that encompass all these
+* define a first milestone that is compatible with the future and already delivers some of the solutions
+* deliver the specifications for image manifest format and transport API
+* deliver a working implementation that can be used as a drop-in replacement for the existing v1 with an equivalent feature-set
+
+**How is the output going to be demoed?**
+
+docker pull
+docker push
+
+**Once demoed, what will be the path to shipping?**
+
+A minimal PR that includes the first subset of features to make docker work well with the new server side components.
+
+## Pressing matters
+
+ * need a codename (ship, distribute)
+ * new repository
+ * new domains
+
+ * architecture / OS
+ * persistent ids
+ * registries discovery
+ * naming (quay.io/foo/bar)
+ * mirroring
+
+
+
+## Assorted issues
+
+ * some devops want a docker engine that cannot do push/pull
+