... Lots of European languages use the diaeresis the normal way (i.e. without the "umlaut" semantics, which is a "combining e above"). The fact that German borrowed...
... Google knows the languages: it DOES use dictionaries based on statistical relationships, and on various language indicators found when indexing pages. So...
... Hint: the above is not German. ... Wrong. Again, my browser is set to use UTF-8 encoding, Google's pages use UTF-8 encoding. Searching for "päästä" and...
... The need to avoid umlauts is certainly decreasing. However, there are a lot of reasons to avoid them as an electronics/computer user, even if it may just be due to...
... The actual problem with the example code is that it uses Java and JDK classes. So certain effects that are nowhere specified in the documentation (e.g.,...
From: "Christian Biere" <christianbiere@...> ... The latter case is using a character out of the BMP, so it cannot be in Unicode 3.2 but 4.0 at least....
This subject reminds me that I also now have a very fast implementation of the Whirlpool hash algorithm in Java, which more than quadruples the performance...
... [...] Ok, which servents other than LimeWire and derivatives have a compliant implementation that supports Unicode as discussed? I'm pretty sure that...
... What do you think about modifying the QRP hash function for better compatibility? In my opinion, it should work on the bytes of the query, not the lower 8...
From: "Philippe Verdy" <verdy_p@...> ... I should have given a related counter-example: 10400;DESERET CAPITAL LETTER LONG I;Lu;0;L;;;;;N;;;;10428; This...
... What do you mean by "worsen"? It would cause false positives in the QRP tables, yes. But that would have been OK for a migration period. You have...
From: "Christian Biere" <christianbiere@...> ... "small lookup table" is not the correct term. It should be at least the Unicode 3.2 case mappings table, so...
From: "Christian Biere" <christianbiere@...> ... Note that QRP is only used for routing through intermediate nodes. Once a query reaches each target node, QRP...
From: "Christian Biere" <christianbiere@...> ... No I computed it manually, but I have made an error... sorry. What was important in the message was the...
... For Unicode 3.2 you only need to map U+0000..U+FFFF. The table is extracted from UnicodeData.txt by using the first column as key and the 14th column as...
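The extraction described above can be sketched in Python. This is only an illustration, assuming the 14th column (index 13 when counting from zero) is the simple lowercase mapping, as in the standard UnicodeData.txt layout. The function name `build_lowercase_table`, the `bmp_only` flag and the `SAMPLE` list are hypothetical and do not come from any servent's code; the Deseret record is the one quoted elsewhere in this thread.

```python
def build_lowercase_table(lines, bmp_only=True):
    """Map code point -> simple lowercase code point from UnicodeData.txt records.

    Assumption: field 0 is the code point (hex) and field 13 (the 14th column)
    is the simple lowercase mapping (hex, possibly empty).
    """
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(";")
        if len(fields) < 14 or not fields[13]:
            continue  # no lowercase mapping for this character
        code = int(fields[0], 16)
        if bmp_only and code > 0xFFFF:
            continue  # restrict the table to U+0000..U+FFFF
        table[code] = int(fields[13], 16)
    return table


# Three records in UnicodeData.txt format, used only as sample input.
SAMPLE = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "00C9;LATIN CAPITAL LETTER E WITH ACUTE;Lu;0;L;0045 0301;;;;N;"
    "LATIN CAPITAL LETTER E ACUTE;;;00E9;",
    "10400;DESERET CAPITAL LETTER LONG I;Lu;0;L;;;;;N;;;;10428;",
]

if __name__ == "__main__":
    for code, lower in sorted(build_lowercase_table(SAMPLE).items()):
        print(f"U+{code:04X} -> U+{lower:04X}")
```

With `bmp_only=True` the Deseret record is skipped, because its code point lies above U+FFFF; passing `bmp_only=False` keeps it.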
From: "Philippe Verdy" <verdy_p@...> ... Note that above, the character U+10400 would be actually encoded in a query string or in a query hit with UTF-8...
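To make the encoding point concrete, here is a small Python sketch showing how U+10400 is represented in UTF-8 and as UTF-16 code units. The helper name `utf16_units` is hypothetical; the byte values follow from the standard UTF-8 and UTF-16 encoding rules, not from any particular servent.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


DESERET_LONG_I = chr(0x10400)  # DESERET CAPITAL LETTER LONG I

# Outside the BMP: four bytes in UTF-8, two code units (a surrogate pair) in UTF-16.
utf8_bytes = DESERET_LONG_I.encode("utf-8")
units = utf16_units(DESERET_LONG_I)

if __name__ == "__main__":
    print(utf8_bytes.hex(" "))
    print(" ".join(f"{u:04X}" for u in units))
```

Each of the two UTF-16 units is a surrogate (in the range U+D800..U+DFFF), so a lookup keyed on single 16-bit units never sees the value 0x10400 itself.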
From: "Christian Biere" <christianbiere@...> ... Do you mean that when one searches for: "Café NOIR, SVP?", the query string will always be sent as: "cafe...
... Yes, that's how I understood it. You documented that LimeWire normalizes query strings it sends itself. I don't find anything that says that routers should...
... Yes, but the QRP hash applies only toLowerCase() to each single UTF-16 character so for that purpose the table as described above is sufficient. You don't...
Hi Everyone, I am currently undertaking doctoral research related to file sharing and have been examining how the Gnutella protocol operates. My expertise is...
I just wanted to know if the UDPHC GGEP extension block, used by hosts to identify themselves as UDP host caches, sent in normal Gnutella network pongs...
Hi Madhu, SCP does not directly correlate with UDPHC. LimeWire, for example, sends out a bunch of UDP pings with the 'SCP' block to normal Gnutella clients --...
Thanks a lot... that cleared up the confusion in my mind after reading the UDP host cache page on the wiki... Just to be dead sure, can the UDPHC extension...
The gnutella protocol tries to take into account the well-being and mutual feelings of the participants. To this goal, the_gdf was founded to help new...
... UDPHC technically can appear in any pong. If it's in a normal Gnutella pong, you can't guarantee that the receiving host understands the feature, so it...