[os-infrastructure] RE: [os-engineering] Model discussion boiled down

Roger L. Whitcomb Roger.Whitcomb at ingres.com
Wed Jun 25 09:38:47 PDT 2008


Andrew,
	I had a number of meetings yesterday and this morning so I
couldn't respond to your original list posting as soon as I wanted to,
but I definitely have some things to say about both of your proposals.

1. I think the "straw man" argument of only proposing two alternatives
is not fair.  Obviously the "fresh" approach will seem better given your
exposition, but there are other alternatives and you are ignoring /
presupposing some things that cannot be overlooked and that need to be
given serious consideration.
2. I don't consider it a given that piccolo is unsuitable for community
access.  Certainly as currently constituted it is not QUITE there, but,
hey, we have the source code and I think we could do some
not-very-time-consuming modifications that could get us 95-99% "there".
And given the current (and even the foreseeable future) size of our
"community committer" base, the scalability of piccolo isn't an issue
either.
3. Your definition of what "fresh" means presupposes that it must be
headrev == headrev, immediate integrations, etc. WITHOUT looking at the
procedures / processes / effort involved in doing that.  I think this is
totally unreasonable, having done submissions, cross-integrations,
etc.  Just this effort could consume a significant portion of our
manpower bandwidth.  And this entirely leaves out the question of IPs,
approvals, cross-integrations to other codelines, platform-build
problems, etc., etc.  These are non-trivial problems that can't be just
"assumed to work".  I think this dooms your definition of "fresh" from
the start.
4. Based on my (admittedly limited) exposure to other open-source
projects (and I have been on the developer mailing lists of two others),
the "delayed" version is not all that unreasonable.  The Apache
Foundation (for instance) has snapshots of stable releases available
(with installers and built binaries for major platforms) and the headrev
of the source available through CVS (which isn't necessarily guaranteed
to even build).  That's it.  Well, Apache is the #1 web server in the
world, so I would think this is a reasonable example of successful
open-source.  They are essentially using the "delayed" model (since CVS
is their internal source-control just as piccolo is ours).  I think this
means that a very reasonable model for community might be that we have
stable release versions available with installers, etc. periodically
(say on a 3- to 6-month schedule) and make nightly snapshots available
through CVS or SVN or whatever for whoever wants to browse / inspect the
source and then make some other mechanism available (maybe expose
piccolo somehow) to those community members who have earned "committer"
status, so that they are effectively Ingres developers.  This would be
far less disruptive and time-consuming for us to implement than the
"fresh" model, and essentially the same as one of the biggest and most
popular open-source projects around.
5. This question of "fresh" vs. "delayed" entirely begs the question of
what processes have to be developed / tested / automated in order to
successfully integrate and cross-integrate community input.  I think
this is the much more interesting / difficult question that we are
trying to address on this mailing list (and which OpenROAD is currently
actively struggling with given the commitment to make the sprint results
available in a community codeline to those sprint contributors within 30
days).
6. This question also ignores the question of what the community edition
should be.  You presume that it will be "main".... But, I'm not sure
that we want it to be.  I think we have to look at how branching would
work (both internally and on the community side) and how to manage THAT
process, given the distinct possibility of disruptive changes either
externally or internally and how to accommodate / facilitate that.  What
David Tondreau was suggesting with the RedHat / Fedora model is an
attempt to model our community process on arguably the most successful
commercial / community project around, and they definitely do NOT have
the community headrev == commercial headrev.
7. The choice of subversion as the community source-control system also
entirely begs the question of what the model should / could be of
actually managing community input.  Certainly being able to browse the
source from a web browser is convenient and SVN facilitates that, but
maybe we don't even need a community source-code control system.  Maybe
(and this is also very much in use in other open-source projects) people
who have bug fixes, new features, etc. have to submit changed files
(or an IP) to us via email or mail in a CD or whatever and we integrate
them ourselves internally in piccolo and then at some point these
changes get reflected in a community source snapshot that can be freely
browsed.  That way we don't even need a community-facing source-code
control system at all, just something that allows convenient access to
source snapshots.  The other open-source project that I am involved with
(Audacity -- a cross-platform audio editor) works that way.  There are a
handful (maybe 6 or 8) people who are the core developers and who have
CVS access and can commit stuff on their own.  Everyone else can get
read-only access through CVS or by downloading a source tarball, but any
changes they might want to make have to be submitted via email to the
developers mailing list and one of the core developers will vet the
change and maybe commit it themselves.  That's it.  There is no
community "source-code control" in that you can browse history, make
changes, add files, etc.

Anyway, I could go on, but the main point I want to make is that the
very small question of what system to use to allow the community to
"access" the source is the cart leading the horse until we figure out
these much more important questions of how it is we can actually manage
the process of integrating the community into our development efforts.

And I don't think that we want to submit a proposal to management for
them to decide on without having done the due diligence to actually
think through the processes, implement the procedures, try them out and
make sure that they are reasonable, cost-effective, and meet the real
needs of the customers and requirements that we actually have, because
if I were Bill, the first question I would have is: "Have you tried this?
Is it going to work?"

Thanks.

Roger Whitcomb | Architect, Engineering | Roger.Whitcomb at ingres.com |
Ingres | 500 Arguello Street | Suite 200 | Redwood City | CA | 94063 |
USA  +1 650-587-5596 | fax: +1 650-587-5550

-----Original Message-----
From: opensource-engineering-bounces at lists.ingres.com
[mailto:opensource-engineering-bounces at lists.ingres.com] On Behalf Of
Andrew Ross
Sent: Tuesday, June 24, 2008 7:33 PM
To: Stephen Ball; Andrew Ross; Open Source Infrastructure; Open Source
Engineering
Subject: RE: [os-engineering] Model discussion boiled down


Steve,

I agree with you about the certified binaries. It was a concern that a
number of people had so it was worthwhile discussing to find closure.
The intention of this thread was to hopefully build consensus around
which of the two models we should recommend to the executive team. That
may help towards the setting/articulating of a clear direction for the
team to follow.

The platform focus (freedom?) for community infrastructure
recommendation is something we need to come to an agreement on as well.

This is all pretty common sense based so perhaps we can sort it out. So
far we all seem to be agreeing. Let's give it at least 48 hours to let
others weigh in and make sure we haven't overlooked anything.

If we were to choose the immediate sync fresh model, occasional
collisions would be less frequent and cheaper to sort out than the typical
ugliness from a delayed model.

Andrew

-----Original Message-----
From: Stephen Ball
Sent: June 24, 2008 10:00 PM
To: Andrew Ross; Open Source Infrastructure; Open Source Engineering
Subject: RE: [os-engineering] Model discussion boiled down

Andrew,

Two comments:

 - Customers are almost certainly paying us for the support and patching
we provide on our certified binaries; even availability of GA code will
make very little difference to that. Whether a customer decides to pay
us for Support is unlikely to be affected by what source code we give
them, and is hardly even a consideration in this equation.

 - One other disadvantage of the "fresh" model is that it is complicated
to code and maintain. We either:
	1) Work a "two phase commit" model on Piccolo and SVN, which
means a Subversion crash makes Piccolo unavailable (or vice-versa).
	2) Work an "immediate" replication model, which leaves us open
to collisions, however unlikely.
	3) Do "two phase commit" when we can and fall back to
deferred when one of the systems is down, in which case we have to code
a way to "catch up".
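
As a rough illustration of option 3, here is a minimal Python sketch. It
assumes two in-memory stand-ins for Piccolo and Subversion; the Repo and
Mirror classes, their method names, and the "synced"/"deferred" return
values are all hypothetical and do not correspond to any real Piccolo or
svn interface. It only shows one direction (primary to secondary), and a
real implementation would also need durable logging of the backlog.

```python
from collections import deque


class Repo:
    """Hypothetical stand-in for a source-control system."""

    def __init__(self, name):
        self.name = name
        self.up = True        # toggled to simulate an outage
        self.staged = None
        self.changes = []

    def prepare(self, change):
        # Phase 1: promise to accept the change, or refuse if down.
        if not self.up:
            return False
        self.staged = change
        return True

    def commit(self):
        # Phase 2: make the staged change permanent.
        self.changes.append(self.staged)
        self.staged = None


class Mirror:
    """Two-phase commit when possible, deferred replay otherwise."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.backlog = deque()  # changes the secondary has not seen yet

    def catch_up(self):
        # Replay deferred changes in order while the secondary accepts them.
        while self.backlog and self.secondary.prepare(self.backlog[0]):
            self.secondary.commit()
            self.backlog.popleft()

    def submit(self, change):
        if not self.primary.prepare(change):
            raise RuntimeError("primary is down; nothing can be committed")
        self.catch_up()  # preserve ordering: drain the backlog first
        if self.backlog or not self.secondary.prepare(change):
            # Secondary unavailable: commit locally and defer replication.
            self.primary.commit()
            self.backlog.append(change)
            return "deferred"
        self.primary.commit()
        self.secondary.commit()
        return "synced"
```

The point of the sketch is that the "catch up" path is extra code that
has to preserve ordering, which is where most of the maintenance cost
would come from.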

Steve

-----Original Message-----
From: opensource-engineering-bounces at lists.ingres.com
[mailto:opensource-engineering-bounces at lists.ingres.com] On Behalf Of
Andrew Ross
Sent: Wednesday, June 25, 2008 6:37 AM
To: Open Source Infrastructure; Open Source Engineering
Subject: [os-engineering] Model discussion boiled down

 
Hi Everyone,
 
Delving into policy, I'd like to deconstruct discussion around our model
for open source affecting server, drivers, and more. Even though
OpenROAD has chosen a stream strategy/model, this may be helpful and
worth considering.

I'd like to share what seems to be crystallization of two (mutually
exclusive) options. This email is a request for comment. 

We intend to bring the outcome of this discussion (expected to be a
finite set of detailed choices) to the executive team to assist in
providing a clear decision.

 
Preamble:
 
We have reached consensus as a team that it is impractical for us to
move off of Piccolo (p) until some outstanding technical and workflow
issues are sorted out. There seems to be agreement that this is the
right direction and recognition that it's going to take time.
 
We also have consensus that p isn't practical for enabling community to
work with us. (it isn't visible to the public, wasn't designed to be,
etc.)
 
Thus we expect to work with p internally and Subversion (svn) for
external code access. This is expected to be the case for the remainder
of 2008. A key next step is deciding how to synchronize the two and what
content to make available publicly.

Policy for server to date has been to use main for development, a branch
off of main for stable enterprise product, and not release source post
GA.



The decision to be made:

The model refers to the relationship between what we share publicly and
when vs. what we protect.

Certified binaries are only provided to enterprise customers. That is
not expected to change.
 
A major decision revolves around what content we store in svn and how
fresh it is. If we make the latest code available in svn, will we reduce
the inclination of Ingres' customers to pay us for support? This presents us
with two choices of what to store in Subversion:
 
a) The "fresh" model

In this model, the latest and greatest Ingres code is mirrored between
piccolo and svn. Headrev = headrev. Changes to either side are
scrutinized with equal rigor. Changes committed to either side are
immediately propagated to the other system (with locking to avoid
conflicts). While we strive to ensure main always passes basic sanity
testing (builds, can create & start a DBMS, etc.) it is definitely an
unstable/development code base.
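
To make "propagated with locking" concrete, here is a minimal Python
sketch. It assumes both systems can be modelled as simple in-memory
histories; the MirroredRepos class and its commit method are
hypothetical and are not part of piccolo or svn. A single lock
serializes commits from either side so the two histories cannot
diverge.

```python
import threading


class MirroredRepos:
    """Hypothetical model of two repositories kept at the same headrev."""

    def __init__(self):
        self._lock = threading.Lock()
        self.history = {"piccolo": [], "svn": []}

    def commit(self, side, change):
        other = "svn" if side == "piccolo" else "piccolo"
        # One lock covers both systems, so a commit on either side is
        # applied to both before any other commit can start.
        with self._lock:
            self.history[side].append(change)
            self.history[other].append(change)
            return len(self.history[side])  # shared head revision number
```

The cost of this simplicity is that every commit waits on one shared
lock, and an outage on either side blocks both.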

Long-lived or particularly disruptive changes are done in branches and
merged back when ready. Inward merges from headrev are done to keep the
code current. "Ready" means that testing and code inspection have
passed and that any other process, such as DDS document review, has
been satisfied. Examples of branches in svn already include: geospatial,
projectD, and more.

Advantages:
- The development team can choose to work in piccolo or svn without
significant differences
- Community can do what development (working on svn) does, on the same
code vintage, with the same tools
- Long lived, innovative, or disruptive projects can work away in
isolation on a branch (allowing inward merges to keep current)
- The same rules for acceptance apply to community or internal work
- The history for each change is built up in svn from this point forward

Disadvantages:
- Is there risk to paying customers?

Commentary:  While in theory customers could run with the community
edition, GPL contamination and instability are very real concerns. I
personally feel no customer would be willing to run their
production/mission critical systems on an unstable/development codeline.


The tar files we've been releasing contain the same code. If saving on
license fees was that important, I suspect they'd be doing this today.
The opposite is happening... people are interested in Ingres *because*
we're open source. 

As Alex pointed out to me, issues do not necessarily equate to bugs, so
there's value beyond the patches. The way I see it, we're selling
insurance based on an open source asset we own.

Bottom line: Since we can decide to change this model and shut down svn
on a whim, it would be a bad idea for anyone to depend on receiving free
patches via our community code repository. I personally am not worried
that a fresh svn repository would hurt our business. 



b) The "delayed" model

In this model, the latest and greatest code is stored in piccolo. The
code in svn is purposely out of date. The precise delay would need to be
determined, but something over 3 months, possibly up to a year, seems to
make sense.
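
A minimal Python sketch of the delay rule, assuming each change is just
a (timestamp, description) pair and that the delay is a single
configurable value. The publishable function and its 90-day default are
hypothetical, chosen only to illustrate the "over 3 months" figure.

```python
from datetime import datetime, timedelta


def publishable(changes, now, delay=timedelta(days=90)):
    """Return the changes old enough to appear in the public svn copy."""
    cutoff = now - delay
    # Each change is a (timestamp, description) tuple; keep the old ones.
    return [change for change in changes if change[0] <= cutoff]


changes = [
    (datetime(2008, 1, 10), "fix in main"),
    (datetime(2008, 5, 1), "newer fix in main"),
]
visible = publishable(changes, now=datetime(2008, 6, 25))
```

With the default delay only the January change would be visible; with a
one-year delay neither would be.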

Advantages:
- Clearly, there's a large differentiator between Enterprise support and
community in terms of timeliness of patches being available.
- Community can work, albeit on stale code
- We can still work disruptive features on branches
- The acceptance rules can stay the same for community or internal work
- The history is built up in svn from this point forward

Disadvantages:
- The code is out of date, and would have to be integrated manually,
which is much more expensive.
- It isn't practical for development to work from svn... They have no
choice but to work from piccolo
- Try as we may, external committers are 2nd class citizens unless we
give them VPN access. This doesn't scale.


Commentary:

In my opinion, the delayed model doesn't really work for community.
There are some things we could do as they don't depend on current code.
For most things though, it is painful to integrate many changes
developed on stale code.


I am advocating model a. I am interested in what the team thinks, and
whether there's another model we haven't considered.

Andrew
_______________________________________________
opensource-engineering mailing list
opensource-engineering at lists.ingres.com
http://lists.ingres.com/mailman/listinfo/opensource-engineering