[os-infrastructure] RE: [os-engineering] Modal discussion boiled dry
Andrew Ross
Andrew.Ross at ingres.com
Wed Jun 25 13:46:08 PDT 2008
Thanks for the clarification Sean. I still have a question.
We have people engaging us today. In fact, the partners working with us
(CWI, Warwick, Ilmenau, Carleton, OSGeo, and more) are very pleased with
what we've been doing. The work here is off on a branch which is
perfectly safe. More people are talking to us every day.
Big = use a branch
Little = feel free to use main
Could you please elaborate on the issue here?
Andrew
-----Original Message-----
From: Sean Thrower
Sent: June 25, 2008 4:40 PM
To: Andrew Ross; Roger L. Whitcomb; Stephen Ball; 'Open Source
Infrastructure'; 'Open Source Engineering'
Subject: RE: [os-engineering] Modal discussion boiled dry
Andrew,
My fuller comment (a couple of emails down this email trail) says what I
wanted to say, but on the specific point you are querying:
- projects involve group agreements, large-scale planning, up-front
organisational commitment, and are mainly a continuation of Partner
Agreements by other means
- working on main means that only tiny changes get even a tryout.
Anything even a little larger, however potentially beneficial, involves
too much change/risk while it is being worked out. I cannot believe
that we would be very encouraging under those circumstances, and without
that I can't see community engaging
- working on a parallel community branch allows more ambition, more
motivation, more exchange of ideas and collaboration, more community in
fact, and we can contribute fruitfully and in a controlled way into
that, to everyone's benefit: win/win.
Regards,
Sean.
My lone inventor comment? "...all by themselves. No need for an Ingres
employee to get involved".
-----Original Message-----
From: Andrew Ross
Sent: Wednesday, June 25, 2008 8:53 PM
To: Sean Thrower; Roger L. Whitcomb; Stephen Ball; 'Open Source
Infrastructure'; 'Open Source Engineering'
Subject: RE: [os-engineering] Modal discussion boiled dry
Sean, please.
We have a large group of people working with us on Geospatial. We have a
medium and growing sized team working with us on ProjectD. Others will
do the same. For big things, we work together on a dedicated branch
until it is ready. For little things, they can work right on main. Where
does the lone inventor path comment come from?
Please help me understand what the difference is. I don't see a problem.
Andrew
-----Original Message-----
From: Sean Thrower
Sent: June 25, 2008 3:42 PM
To: Andrew Ross; Roger L. Whitcomb; Stephen Ball; 'Open Source
Infrastructure'; 'Open Source Engineering'
Subject: RE: [os-engineering] Modal discussion boiled dry
Andrew,
Ah. I see we have different concepts of community and contribution. As
you say, individuals can tread the lone inventor path with what option 1
offers, but I don't see how that can be fruitful or attractive for them
or us.
We surely need to do better than this, and we can.
I've just seen your subsequent email. There were concerns expressed
today, and there were proposals, and neither should be undervalued. I
appreciate this is a labour of Hercules, but it should be done right.
Regards,
Sean.
-----Original Message-----
From: Andrew Ross
Sent: Wednesday, June 25, 2008 8:14 PM
To: Sean Thrower; Roger L. Whitcomb; Stephen Ball; 'Open Source
Infrastructure'; 'Open Source Engineering'
Subject: RE: [os-engineering] Model discussion boiled dry
Hi Sean, Everyone.
After a community member signs the contribution agreement, and requests
their account, they can create a branch that they own, or destroy it (if
they own it), or add others who can contribute to it (if they own it),
or remove them (likewise), all by themselves. No need for an Ingres employee to
get involved.
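A minimal sketch of that ownership rule, assuming a simple in-memory model. The class and method names here (BranchRegistry, Branch, register_member, and so on) are hypothetical illustrations, not part of Subversion or of any Ingres tooling:

```python
class Branch:
    """A branch with one owner and a set of permitted contributors."""

    def __init__(self, name, owner):
        self.name = name
        self.owner = owner
        self.contributors = {owner}

    def _require_owner(self, actor):
        # Only the owner may manage the branch.
        if actor != self.owner:
            raise PermissionError(f"{actor} does not own {self.name}")

    def add_contributor(self, actor, user):
        self._require_owner(actor)
        self.contributors.add(user)

    def remove_contributor(self, actor, user):
        self._require_owner(actor)
        if user == self.owner:
            raise ValueError("the owner cannot be removed")
        self.contributors.discard(user)

    def can_commit(self, user):
        return user in self.contributors


class BranchRegistry:
    """Self-service branch management for registered members."""

    def __init__(self):
        self.members = set()   # signed the agreement and have an account
        self.branches = {}

    def register_member(self, user):
        self.members.add(user)

    def create(self, user, name):
        if user not in self.members:
            raise PermissionError(f"{user} has no account")
        if name in self.branches:
            raise ValueError(f"branch {name} already exists")
        self.branches[name] = Branch(name, user)
        return self.branches[name]

    def destroy(self, actor, name):
        self.branches[name]._require_owner(actor)
        del self.branches[name]
```

Every management action checks ownership and nothing else, so no administrator needs to be in the loop once the account exists.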
There is nothing in what we've proposed that would prevent us from
helping a community member with pre-discussing the approach, asking
questions throughout, code inspecting, helping to decode test results,
and more.
Andrew
-----Original Message-----
From: Sean Thrower
Sent: June 25, 2008 2:44 PM
To: Andrew Ross; Roger L. Whitcomb; Stephen Ball; Open Source
Infrastructure; Open Source Engineering
Subject: RE: [os-engineering] Model discussion boiled dry
Andrew,
Before we all get into missionary mode/position over this, my concern is
this: Given that we are all mandated to use 10% of our time in
collaborating with the community, something I'm sure we all
wholeheartedly support (especially having recently seen how well it can
work), the effect of option 1 is surely to rob this commitment of value:
- either the community have to apply for a branch for a special project,
in order to have an environment to work with us on, or
- they can only contribute if they have already worked up their
contribution to an engineering standard all by themselves
Where is the collaboration? If I was a client, I would walk away from a
setup like that. Is that what we want? Lockstep = lockjaw?
Contrast that with a "parallel branch" approach, where community can
work out their contributions with help and contributions from us,
bringing them to a much higher quality before formally contributing
them. It enables them to start small ("useful enhancement") instead of
grandiose (the "project" approach), and this is easier for us to
collaborate with, and will surely benefit us. Timely cross-integration
from main shows we mean business, and our collaboration enables friendly
policing. We know this works, and indeed we've done it before (when
Postgres was used by Ingres in a similar role).
I don't want to oversimplify (ok, ok), but to raise a real concern that
we are making a grave mistake by pushing so nakedly for option 1. Let's
not forget what the point of community was supposed to be.
Regards,
Sean.
-----Original Message-----
From: opensource-engineering-bounces at lists.ingres.com
[mailto:opensource-engineering-bounces at lists.ingres.com] On Behalf Of
Andrew Ross
Sent: Wednesday, June 25, 2008 6:21 PM
To: Roger L. Whitcomb; Stephen Ball; Open Source Infrastructure; Open
Source Engineering
Subject: RE: [os-engineering] Model discussion boiled down
Hi Roger,
I appreciate your email but I'm not sure what to say. I'll try to be
brief to deal with some of your points as best I can.
1. Comment noted. I'm not sure if there's a better way to build
consensus. I am open minded if you'd like to suggest a better method.
Short of a better proposal, we'll keep dealing with decisions that need
to be made by defining the issue, identifying options, comparing
advantages and disadvantages, advocating the outcome to the leadership
team for a decision, and then moving forward. All are welcome to
participate in this.
I am heartened by the agreement we've seen with respect to the "fresh"
vs. "delayed" model and also in terms of platform focus for community.
Beyond this, there are many who just don't care for the discussion and
are working away at the vision we've communicated.
2. We've already reached consensus that developing piccolo further to
provide public access, an authentication model, and more is work in the
wrong direction. If you'd like to discuss one on one with me, I'd be
happy to talk with you about this.
3. We are clearly stating that code must pass the same quality criteria
on either side. The same processes are used for community work as
internal work. Please feel free to ask any specific questions in this
area. If we haven't covered something off, please let us know what it
is.
4. You've described Apache as providing headrev source, snapshots (aka
tags) of stable milestones, and binaries for major platforms. Is it not
clear that this is what we're advocating?
5. See point 3.
6. Main = Fedora, the stable branches off main = RHEL, the other
branches are for disruptive work.
7. Yes, if we want to be successful as an open source company and build
community, we DO need a public code repository, and a public bug
tracker, and all of the things I've detailed. Please accept this.
Andrew
-----Original Message-----
From: Roger L. Whitcomb
Sent: June 25, 2008 12:39 PM
To: Andrew Ross; Stephen Ball; Andrew Ross; Open Source Infrastructure;
Open Source Engineering
Subject: RE: [os-engineering] Model discussion boiled down
Andrew,
I had a number of meetings yesterday and this morning so I
couldn't respond to your original list posting as soon as I wanted to,
but I definitely have some things to say about both of your proposals.
1. I think the "straw man" argument of only proposing two alternatives
is not fair. Obviously the "fresh" approach will seem better given your
exposition, but there are other alternatives and you are ignoring /
presupposing some things that cannot be overlooked and that need to be
given serious consideration.
2. I don't consider it a given that piccolo is unsuitable for community
access. Certainly as currently constituted it is not QUITE there, but,
hey, we have the source code and I think we could do some
not-very-time-consuming modifications that could get us 95-99% "there".
And given the current (and even the foreseeable future) size of our
"community committer" base, the scalability of piccolo isn't an issue
either.
3. Your definition of what "fresh" means presupposes that it must be
headrev == headrev, immediate integrations, etc. WITHOUT looking at the
procedures / processes / effort involved in doing that. I think this is
totally unreasonable, having been doing submissions, cross-integrations,
etc. Just this effort could consume a significant portion of our
manpower bandwidth. And this entirely leaves out the question of IPs,
approvals, cross-integrations to other codelines, platform-build
problems, etc., etc. These are non-trivial problems that can't be just
"assumed to work". I think this dooms your definition of "fresh" from
the start.
4. Based on my (admittedly limited) exposure to other open-source
projects (and I have been on the developer mailing lists of two others),
the "delayed" version is not all that unreasonable. The Apache
Foundation (for instance) has snapshots of stable releases available
(with installers and built binaries for major platforms) and the headrev
of the source available through CVS (which isn't necessarily guaranteed
to even build). That's it. Well, Apache is the #1 web server in the
world, so I would think this is a reasonable example of successful
open-source. They are essentially using the "delayed" model (since CVS
is their internal source-control just as piccolo is ours). I think this
means that a very reasonable model for community might be that we have
stable release versions available with installers, etc. periodically
(say on a 3- to 6-month schedule) and make nightly snapshots available
through CVS or SVN or whatever for whoever wants to browse / inspect the
source and then make some other mechanism available (maybe expose
piccolo somehow) to those community members who have earned "committer"
status, so that they are effectively Ingres developers. This would be
far less disruptive and time-consuming for us to implement than the
"fresh" model, and essentially the same as one of the biggest and most
popular open-source projects around.
5. This question of "fresh" vs. "delayed" entirely begs the question of
what processes have to be developed / tested / automated in order to
successfully integrate and cross-integrate community input. I think
this is the much more interesting / difficult question that we are
trying to address on this mailing list (and which OpenROAD is currently
actively struggling with given the commitment to make the sprint results
available in a community codeline to those sprint contributors within 30
days).
6. This question also ignores the question of what the community edition
should be. You presume that it will be "main".... But, I'm not sure
that we want it to be. I think we have to look at how branching would
work (both internally and on the community side) and how to manage THAT
process, given the distinct possibility of disruptive changes either
externally or internally and how to accommodate / facilitate that. What
David Tondreau was suggesting with the RedHat / Fedora model is an
attempt to model our community process on arguably the most successful
commercial / community project around, and they definitely do NOT have
the community headrev == commercial headrev.
7. The choice of subversion as the community source-control system also
entirely begs the question of what the model should / could be of
actually managing community input. Certainly being able to browse the
source from a web browser is convenient and SVN facilitates that, but
maybe we don't even need a community source-code control system. Maybe
(and this is also very much in use in other open-source projects) people
who have bug fixes or new features etc. have to submit changed files
(or an IP) to us via email or mail in a CD or whatever and we integrate
them ourselves internally in piccolo and then at some point these
changes get reflected in a community source snapshot that can be freely
browsed. That way we don't even need a community-facing source-code
control system at all, just something that allows convenient access to
source snapshots. The other open-source project that I am involved with
(Audacity -- a cross-platform audio editor) works that way. There are a
handful (maybe 6 or 8) people who are the core developers and who have
CVS access and can commit stuff on their own. Everyone else can get
read-only access through CVS or by downloading a source tarball, but any
changes they might want to make have to be submitted via email to the
developers mailing list and one of the core developers will vet the
change and maybe commit it themselves. That's it. There is no
community "source-code control" in that you can browse history, make
changes, add files, etc.
Anyway, I could go on, but the main point I want to make is that the
very small question of what system to use to allow the community to
"access" the source is the cart leading the horse until we figure out
these much more important questions of how it is we can actually manage
the process of integrating the community into our development efforts.
And I don't think that we want to submit a proposal to management for
them to decide on without having done the due diligence to actually
think through the processes, implement the procedures, try them out and
make sure that they are reasonable, cost-effective, and meet the real
needs of the customers and requirements that we actually have, because
if I was Bill, the first question I would have is: "Have you tried this?
Is it going to work?"
Thanks.
Roger Whitcomb | Architect, Engineering | Roger.Whitcomb at ingres.com |
Ingres | 500 Arguello Street | Suite 200 | Redwood City | CA | 94063 |
USA +1 650-587-5596 | fax: +1 650-587-5550
-----Original Message-----
From: opensource-engineering-bounces at lists.ingres.com
[mailto:opensource-engineering-bounces at lists.ingres.com] On Behalf Of
Andrew Ross
Sent: Tuesday, June 24, 2008 7:33 PM
To: Stephen Ball; Andrew Ross; Open Source Infrastructure; Open Source
Engineering
Subject: RE: [os-engineering] Model discussion boiled down
Steve,
I agree with you about the certified binaries. It was a concern that a
number of people had so it was worthwhile discussing to find closure.
The intention of this thread was to hopefully build consensus around
which of the two models we should recommend to the executive team. That
may help towards the setting/articulating of a clear direction for the
team to follow.
The platform focus (freedom?) for community infrastructure
recommendation is something we need to come to an agreement on as well.
This is all pretty common sense based so perhaps we can sort it out. So
far we all seem to be agreeing. Let's give it at least 48 hours to let
others weigh in and make sure we haven't overlooked anything.
If we were to choose the immediate sync fresh model, occasional
collisions would be less often & cheaper to sort out than the typical
ugliness from a delayed model.
Andrew
-----Original Message-----
From: Stephen Ball
Sent: June 24, 2008 10:00 PM
To: Andrew Ross; Open Source Infrastructure; Open Source Engineering
Subject: RE: [os-engineering] Model discussion boiled down
Andrew,
Two comments:
- Customers are almost certainly paying us for the support and patching
we provide on our certified binaries; even availability of GA code will
make very little difference to that. Whether a customer decides to pay
us for Support is unlikely to be affected by what source code we give
them, and is hardly even a consideration in this equation.
- One other disadvantage of the "fresh" model is that it is complicated
to code and maintain. We either:
1) work a "two phase commit" model on Piccolo and SVN, which
means a Subversion crash means piccolo is not available (or vice-versa)
2) Work an "immediate" replication model, which leaves us open
for collisions, although unlikely.
3) we do "two phase commit" when we can and fall back to
deferred when one of the systems is down, in which case we have to code
a way to "catch up".
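Option 3 can be sketched roughly as below. This is a minimal illustration under stated assumptions: the Repo and Mirror classes are hypothetical in-memory stand-ins, not the Piccolo or Subversion interfaces, and the sketch does not implement a real two-phase commit protocol. It only shows the fallback to deferred mode and the "catch up" replay once the other system returns:

```python
from collections import deque


class Repo:
    """Hypothetical in-memory stand-in for one source-control system."""

    def __init__(self, name):
        self.name = name
        self.available = True
        self.changes = []

    def apply(self, change):
        if not self.available:
            raise ConnectionError(f"{self.name} is down")
        self.changes.append(change)


class Mirror:
    """Commit locally, mirror at once when possible, otherwise defer."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote
        self.backlog = deque()  # changes the remote has not yet seen

    def commit(self, change):
        self.local.apply(change)     # the local commit must succeed
        self.backlog.append(change)  # queue first so ordering is kept
        self.catch_up()              # mirrors immediately if remote is up

    def catch_up(self):
        """Replay queued changes in order; stop at the first failure."""
        while self.backlog:
            try:
                self.remote.apply(self.backlog[0])
            except ConnectionError:
                return False         # still down; keep the backlog
            self.backlog.popleft()
        return True
```

Replaying the backlog strictly in order keeps both histories identical once the remote returns. It does not address collisions from commits made on the other side in the meantime, which is the risk raised under option 2.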
Steve
-----Original Message-----
From: opensource-engineering-bounces at lists.ingres.com
[mailto:opensource-engineering-bounces at lists.ingres.com] On Behalf Of
Andrew Ross
Sent: Wednesday, June 25, 2008 6:37 AM
To: Open Source Infrastructure; Open Source Engineering
Subject: [os-engineering] Model discussion boiled down
Hi Everyone,
Delving into policy, I'd like to deconstruct discussion around our model
for open source affecting server, drivers, and more. Even though
OpenROAD has chosen a stream strategy/model, this may be helpful and
worth considering.
I'd like to share what seems to be crystallization of two (mutually
exclusive) options. This email is a request for comment.
We intend to bring the outcome of this discussion (expected to be a
finite set of detailed choices) to the executive team to assist in
providing a clear decision.
Preamble:
We have reached consensus as a team that it is impractical for us to
move off of Piccolo (p) until some outstanding technical and workflow
issues are sorted out. There seems to be agreement that this is the
right direction and recognition that it's going to take time.
We also have consensus that p isn't practical for enabling community to
work with us. (it isn't visible to the public, wasn't designed to be,
etc.)
Thus we expect to work with p internally and Subversion (svn) for
external code access. This is expected to be the case for the remainder
of 2008. A key next step is deciding how to synchronize the two and what
content to make available publicly.
Policy for server to date has been to use main for development, a branch
off of main for stable enterprise product, and not release source post
GA.
The decision to be made:
The model refers to the relationship between what we share publicly and
when vs. what we protect.
Certified binaries are only provided to enterprise customers. That is
not expected to change.
A major decision revolves around what content we store in svn and how fresh it
is. If we make the latest code available in svn, will we reduce
inclination of Ingres' customers to pay us for support? This presents us
with two choices of what to store in Subversion:
a) The "fresh" model
In this model, the latest and greatest Ingres code is mirrored between
piccolo and svn. Headrev = headrev. Changes to either side are
scrutinized with equal rigor. Changes committed to either side are
immediately propagated to the other system (with locking to avoid
conflicts). While we strive to ensure main always passes basic sanity
testing (builds, can create & start a DBMS, etc.) it is definitely an
unstable/development code base.
Long-lived or particularly disruptive changes are done in branches and
merged back when ready. Inward merges from headrev are done to keep the
code current. Ready implies that testing and code inspection have passed
and that any other process, such as DDS document review, has been
satisfied. Examples of branches in svn already include: geospatial,
projectD, and more.
Advantages:
- The development team can choose to work in piccolo or svn without
significant differences
- Community can do what development (working on svn) does, on the same
code vintage, with the same tools
- Long lived, innovative, or disruptive projects can work away in
isolation on a branch (allowing inward merges to keep current)
- The same rules for acceptance apply to community or internal work
- The history for each change is built up in svn from this point forward
Disadvantages:
- Is there risk to paying customers?
Commentary: While in theory customers could run with the community
edition, GPL contamination and instability is a very real concern. I
personally feel no customer would be willing to run their
production/mission critical systems on an unstable/development codeline.
Our tar files we've been releasing contain the same code. If saving on
license fees was that important, I suspect they'd be doing this today.
The opposite is happening... people are interested in Ingres *because*
we're open source.
As Alex pointed out to me, issues do not necessarily equate to bugs, so
there's value beyond the patches. The way I see it, we're selling
insurance based on an open source asset we own.
Bottom line: Since we can decide to change this model and shut down svn
on a whim, it would be a bad idea for anyone to depend on receiving free
patches via our community code repository. I personally am not worried
that a fresh svn repository would hurt our business.
b) The "delayed" model
In this model, the latest and greatest code is stored in piccolo. The
code in svn is purposely out of date. The precise delay would need to be
determined but something over 3 months, possibly up to a year seems to
make sense.
Advantages:
- Clearly, there's a large differentiator between Enterprise support and
community in terms of timeliness of patches being available.
- Community can work, albeit on stale code
- We can still work disruptive features on branches
- The acceptance rules can stay the same for community or internal work
- The history is built up in svn from this point forward
Disadvantages:
- The code is out of date, and would have to be integrated manually
which is much more expensive.
- It isn't practical for development to work from svn... They have no
choice but to work from piccolo
- Try as we may, external committers are 2nd class citizens unless we
give them VPN access. This doesn't scale.
Commentary:
In my opinion, the delayed model doesn't really work for community.
There are some things we could do as they don't depend on current code.
For most things though, it is painful to integrate many changes
developed on stale code.
I am advocating model a. I am interested in what the team thinks, and
whether there's another model we haven't considered.
Andrew
_______________________________________________
opensource-engineering mailing list
opensource-engineering at lists.ingres.com
http://lists.ingres.com/mailman/listinfo/opensource-engineering