While on the topic of MD5ing the message/header - please note that the protocols involved do not in any way guarantee that the bits I send will be the same bits you receive - just that they'll have the same content.
Just look at your mail's headers, raw, and you'll see it's so.
In fact, significant parts of the relevant RFCs deal with how different (old) mail gateways might choose to garble your messages and how compliant (new) clients and servers should protect themselves. A good starting point is RFCs 2045-49, describing MIME.
So *maybe* MD5ing the contents, *after parsing and decoding*, could work, but that's pretty ugly.
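To make the "parse and decode first" idea concrete, here is a rough Python sketch (not Squeak code, and not anything the list has agreed on). It hashes only the decoded body parts and ignores all headers. The function name content_digest, the two sample messages, and the normalisation choices (CRLF folded to LF, trailing whitespace stripped) are assumptions made for illustration; a real gateway could alter a message in ways this does not cover, such as re-encoding the charset.

```python
import hashlib
from email import message_from_bytes, policy


def content_digest(raw: bytes) -> str:
    """MD5 over the decoded body parts only (hypothetical helper)."""
    msg = message_from_bytes(raw, policy=policy.default)
    md5 = hashlib.md5()
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        # Undo base64 / quoted-printable transfer encoding.
        payload = part.get_payload(decode=True) or b""
        # Assumed normalisation: unify line endings, drop trailing whitespace.
        payload = payload.replace(b"\r\n", b"\n").rstrip()
        md5.update(part.get_content_type().encode("ascii"))
        md5.update(b"\0")
        md5.update(payload)
        md5.update(b"\0")
    return md5.hexdigest()


# The same text as two different gateways might deliver it (made-up samples).
raw_a = (
    b"Received: from gateway-a\r\n"
    b"From: a@example.com\r\n"
    b"Subject: hi\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"SGVsbG8sIHdvcmxkCg==\r\n"
)
raw_b = (
    b"From: a@example.com\n"
    b"Subject: hi\n"
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: 7bit\n"
    b"\n"
    b"Hello, world\n"
)

same_content = content_digest(raw_a) == content_digest(raw_b)
same_bits = hashlib.md5(raw_a).hexdigest() == hashlib.md5(raw_b).hexdigest()
```

Here the raw bytes differ (extra Received header, different transfer encoding and line endings), so hashing the raw message gives two different values, while the decoded-content digest agrees.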
Ummm, what was the application again?
*** Bijan said : Thus, unless I'm confused or mistaken, I cannot share a categorization of a number of messages with someone else without sharing the particular message file (or subset thereof). ***
Bijan, could you clarify what you're trying to do?
Daniel
Duane Maxwell dmaxwell@san.rr.com wrote:
Tim Rowledge wrote:
Duane Maxwell wrote:
What's wrong with using an MD5 hash of the entire message? While collisions are possible, they're very unlikely.
Urk, how long would that take? What about one of those _really_ long messages where some dipstick quotes an entire 100kb digest, adds a 250kb code 'snippet' and dear ol'M$ s/w adds it all again in html plus one of those '.vcf' thingies?
Well, just in case the Squeak MD5 code were too slow, somebody (who shall remain nameless) wrote a Slang version for an MD5 plugin, so it wouldn't really take all that long. The computation would likely be small compared to the time to download the message in the first place.
Would hashing the message's header be plausible?
Maybe.
Is there any identifier that the pop/imap server one gets the message from can be persuaded to provide?
AFAIK, there's nothing one can rely on.
-- Duane
On Wed, 31 Oct 2001 danielv@netvision.net.il wrote:
[snip]
Ummm, what was the application again?
*** Bijan said : Thus, unless I'm confused or mistaken, I cannot share a categorization of a number of messages with someone else without sharing the particular message file (or subset thereof). ***
Bijan, could you clarify what you're trying to do?
[snip]
Sure. Let's say I make a category, 'foo'. Into it, I put a number of messages, sorted by hand.
Now, I want to share that category with you. Since a category is just a list of msgIDs, themselves bearing no inherent relation, if I just give you the category...no dice. My msgID won't match up *in any way* with yours.
Perhaps I'm being presumptuous. Normally, if you share a mail folder, I imagine you send the actual messages within. Bummer.
Email sucks! :)
But the total break between msgID and content, the arbitrariness of the association, *does* imho make categories a bit less useful. But maybe I'm hallucinating :)
Cheers, Bijan Parsia.
Bijan Parsia writes:
Sure. Let's say I make a category, 'foo'. Into it, I put a number of messages, sorted by hand.
Now, I want to share that category with you. Since a category is just a list of msgIDs, themselves bearing no inherent relation, if I just give you the category...no dice. My msgID won't match up *in any way* with yours.
Perhaps I'm being presumptuous. Normally, if you share a mail folder, I imagine you send the actual messages within. Bummer.
Email sucks! :)
But the total break between msgID and content, the arbitrariness of the association, *does* imho make categories a bit less useful. But maybe I'm hallucinating :)
No, it only means that message IDs must be computable from the content only, and that independent computations on the same content yield the same result. Some sort of message digest like MD5 is the answer, but only if you can guarantee the content is identical. If it isn't, I'm not sure I see the point of the exercise, since you would be sharing something other than what you think you're sharing.
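As a rough illustration of what that would buy you, here is a small Python sketch in which a category is exported as a list of content digests rather than local msgIDs, and the recipient resolves those digests against their own store. The names export_category and import_category, the dict-based stores, and the sample contents are all made up for the example, and it assumes both sides already hold byte-identical content.

```python
import hashlib


def digest(content: bytes) -> str:
    """ID computed from the content alone, so anyone can recompute it."""
    return hashlib.md5(content).hexdigest()


def export_category(local_ids, store):
    """Turn a category (list of local msgIDs) into shareable digests."""
    return sorted(digest(store[i]) for i in local_ids)


def import_category(shared_digests, store):
    """Map shared digests back to the recipient's own local msgIDs."""
    by_digest = {digest(content): i for i, content in store.items()}
    found = sorted(by_digest[d] for d in shared_digests if d in by_digest)
    missing = [d for d in shared_digests if d not in by_digest]
    return found, missing


# Two mailboxes with unrelated local msgIDs but overlapping content.
sender_store = {
    1: b"How fast is the MD5 plugin?",
    2: b"Question about categories",
    3: b"Unrelated chatter",
}
recipient_store = {
    17: b"Question about categories",
    42: b"How fast is the MD5 plugin?",
}

shared = export_category([1, 2], sender_store)
found, missing = import_category(shared, recipient_store)
```

The recipient ends up with their own IDs 17 and 42 in the category. Anything reported in missing is a message they do not have (or have in a form whose bytes differ), which is exactly the case where the digest approach stops helping.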
-- Duane
squeak-dev@lists.squeakfoundation.org