<Corbin>
Whoo, vmprof is much easier the second time around. I was ready for it. Does vmprof_execute_code go around the JIT merge point, or do the JIT merge points go around the function decorated with vmprof_execute_code? I think I got it backwards.
antocuni has quit [Ping timeout: 256 seconds]
antocuni has joined #pypy
<cfbolz>
Corbin: sorry, I have no idea
greedom has joined #pypy
greedom has quit [Ping timeout: 240 seconds]
greedom has joined #pypy
greedom has quit [Remote host closed the connection]
greedom has joined #pypy
<arigato>
cfbolz: nowadays there is good hash randomization for strings in cpython and pypy, if I remember correctly
<arigato>
So if you have a dictionary of ints, you can str() the keys...
<cfbolz>
arigato: ah, it applies only to strings?
<arigato>
Yes, and maybe bytes and I think datetime??
<cfbolz>
Hm
<cfbolz>
So the solution would be to apply it systematically to everything?
<cfbolz>
With an xor maybe?
<arigato>
With ints, I don't think applying xor helps, because hashes with the same low bits will still have the same low bits
<cfbolz>
so you would really need to mix the bits
<cfbolz>
(which is anyway a good idea maybe given that consecutive ints are common)
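[Editor's sketch, not part of the conversation] The xor point above can be checked in a simplified model where a table picks a bucket from the low bits of the hash only (real dict implementations also perturb with the higher bits, so this is an assumption for illustration). The `secret` value and helper names are hypothetical; `mix64` is the splitmix64 finalizer, used here only as one example of a bit mixer, not as what CPython or PyPy do.

```python
MASK64 = (1 << 64) - 1


def bucket(h, nbits):
    """Low bits of a hash: a simplified stand-in for a power-of-two table index."""
    return h & ((1 << nbits) - 1)


def mix64(x):
    """splitmix64 finalizer: a fixed bijection on 64-bit values that spreads bits."""
    x &= MASK64
    x ^= x >> 30
    x = (x * 0xBF58476D1CE4E5B9) & MASK64
    x ^= x >> 27
    x = (x * 0x94D049BB133111EB) & MASK64
    x ^= x >> 31
    return x


# 1000 ints that all share the same low 10 bits (all zero).
keys = [i << 10 for i in range(1000)]
secret = 0x5DEECE66D  # hypothetical per-process secret

# xor only flips the low bits by a constant, so the keys still share one bucket.
xor_buckets = {bucket(k ^ secret, 10) for k in keys}

# Mixing pulls the high bits down into the low bits, so the keys spread out.
mixed_buckets = {bucket(mix64(k), 10) for k in keys}

print(len(xor_buckets), len(mixed_buckets))
```

Note that a fixed, publicly known mixer only helps with accidental patterns such as consecutive ints; someone who knows the function can still search for inputs that collide after mixing.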
<arigato>
Or just str() and then rely on the string hash, which should be strong
<cfbolz>
arigato: ok, but also the wrong complexity somehow
<arigato>
I mean, as a solution for the OP of that issue
<arigato>
Not for pypy itself, no
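[Editor's sketch, not part of the conversation] The str() workaround suggested for the OP could look roughly like the wrapper below. The class name `StrKeyedDict` is hypothetical, and the sketch assumes all keys are ints, so that distinct keys always map to distinct strings.

```python
class StrKeyedDict:
    """Hypothetical wrapper that stores int keys under their decimal string,
    so lookups go through the randomized string hash instead of the int hash."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        self._data[str(key)] = value

    def __getitem__(self, key):
        return self._data[str(key)]

    def __contains__(self, key):
        return str(key) in self._data

    def __len__(self):
        return len(self._data)


d = StrKeyedDict()
for i in range(5):
    d[i << 10] = i  # keys that would share their low bits as plain ints
print(len(d), d[3 << 10])
```

Every operation now pays for a str() conversion and a string hash, which is proportional to the number of digits rather than constant, which is the complexity concern raised above.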
<cfbolz>
right
<cfbolz>
for the op, I am not sure
<cfbolz>
it sounds like they just want to use dicts directly
<arigato>
He mentions using xor but I'm not sure it helps, I think at most it slightly changes the collisions in a discoverable way
<cfbolz>
arigato: yes, but he also says: "While this is the best work around that I could think of, it also makes the code a mess. So I really don't want to have to do this."
<arigato>
Note that even if all dicts were to transform the hash they get with any bijection of the set of 64-bits ints, it doesn't help in all cases
<cfbolz>
arigato: right
<arigato>
Maybe the attacker can generate 128-bit integers of his choosing, and then he can pick many of them with the same 64-bit hash
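[Editor's sketch, not part of the conversation] On CPython this is easy to reproduce, because the hash of a non-negative int is its value modulo `sys.hash_info.modulus`. The sketch below assumes CPython; other implementations may hash large ints differently. Since every key already has the same full hash before the dict sees it, no bijection applied to the hash afterwards can separate them.

```python
import sys

# CPython reduces int hashes modulo this prime (2**61 - 1 on 64-bit builds).
P = sys.hash_info.modulus

base = 12345
# Distinct big ints that are all congruent to `base` modulo P.
keys = [base + i * P for i in range(1000)]

hashes = {hash(k) for k in keys}
print(len(set(keys)), len(hashes))
```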
<cfbolz>
arigato: also, as soon as any info about the randomized hash of inputs you control leaks, it's over anyway
<cfbolz>
So in many ways it's just the wrong tool
<arigato>
That's why the good hash randomization works on strings in the first place, instead of on hashes
<cfbolz>
Yes
<arigato>
I think the string hash randomization is good because you can't predict which other strings will collide even if you have examples of collision or non-collision
<arigato>
I'm not sure it helps against a determined attacker though
<arigato>
Assuming there is a probe that answers samehashbits(str1, str2, nbits)
<arigato>
... No, I don't see a way, I think the attacker needs to probe one million strings in order to get 1000 of them with the same 10 bits
<arigato>
And then he sends these 1000 strings and the program is quadratic, but he already had to do a quadratic number of probes
<arigato>
So at best it's amortized
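[Editor's sketch, not part of the conversation] The probe-count argument can be simulated. The sketch below assumes a keyed hash as a stand-in for the randomized string hash (keyed BLAKE2b from `hashlib`, chosen only for convenience, not what CPython or PyPy use) and implements the hypothetical `samehashbits` oracle on top of it. The names `keyed_hash`, `collect_colliding` and `quadratic_cost` are made up for this illustration.

```python
import hashlib
import itertools
import os

SECRET = os.urandom(16)  # unknown to the attacker


def keyed_hash(s):
    """Stand-in for a randomized string hash: 64 bits of keyed BLAKE2b."""
    digest = hashlib.blake2b(s.encode(), key=SECRET, digest_size=8).digest()
    return int.from_bytes(digest, "little")


def samehashbits(str1, str2, nbits):
    """Hypothetical oracle: do the two strings agree on the low nbits of the hash?"""
    mask = (1 << nbits) - 1
    return (keyed_hash(str1) ^ keyed_hash(str2)) & mask == 0


def collect_colliding(want, nbits):
    """Probe candidates until `want` strings share their low nbits.
    Returns the strings and the number of oracle calls used."""
    reference = "probe-0"
    found = [reference]
    probes = 0
    for i in itertools.count(1):
        if len(found) >= want:
            break
        candidate = f"probe-{i}"
        probes += 1
        if samehashbits(reference, candidate, nbits):
            found.append(candidate)
    return found, probes


def quadratic_cost(n):
    """Comparisons needed to insert n keys that all land in one bucket."""
    return n * (n - 1) // 2


# Small parameters so the demo runs quickly; each probe hits with chance 2**-nbits.
found, probes = collect_colliding(want=50, nbits=6)
print(len(found), probes, quadratic_cost(len(found)))
```

With the numbers from the discussion (1000 strings, 10 bits) the expected probe count is about 999 * 1024, roughly a million, while the victim's extra work per batch is about `quadratic_cost(1000)`, i.e. 499500 comparisons, so the attacker only comes out ahead by replaying the same batch.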
<cfbolz>
arigato: I see
<arigato>
Maybe he can repeatedly send these 1000 strings, or something
<arigato>
Still looks impractical
greedom has quit [Remote host closed the connection]