Re: [Geopriv] location obscuring

From: Richard L. Barnes ^lt;rbarnes@bbn.com>
Date: Fri Aug 27 2010 - 13:09:44 EDT

Hey Jorge,

I enjoyed your ASCII art, and this is an interesting idea, but seems
to have a lot more complexity? Could you clarify what you think the
benefit of this system is over the one Martin had proposed? I
actually think it's quite a bit worse, in that it misses the point of
Martin's algorithm.

Here's my thinking:

1. There's not really any point to choosing the area returned
probabilistically, since you can get both areas with repeated
queries. You might as well return the union of the two hexagons.

2. There's actually no point to returning the second hexagon either,
since a moving target will reveal which hexagon he was in. For
example, zooming in on the a piece of the lattice, suppose an entity X
moves as indicated:

   \ A /
    \ /
     \ /
      +
   X---->X
B | C
      |
      +
     / \
    / \
   / D \

At first, the algorithm will return A+B, and afterward, it will return
A+C, so an observer can deduce that the target has moved from B to C,
and can discard A as a possible location.

3. Even worse, based on the timing of the change, the observer can
tell that the target is somewhere near the boundary (H below),
obviously an area much smaller than A, B, C, or D, much less A+B or A+C.

   \ A /
    \ /
     \ /
      +
      HX
B H C
      H
      +
     / \
    / \
   / D \

This is exactly the problem that Martin's algorithm solves with the
"hidden trigger" concept. That feature of the algorithm prevents an
observer from isolating the target to a boundary, so the problem (3)
above doesn't arise.

--Richard

On Aug 27, 2010, at 12:08 PM, Jorge Cuellar wrote:

> There is a simpler version of (a slight modification) of
> the algorithm I proposed with the grid.
>
> The idea again is to have a set of "landmarks" that are the
> "centers" of regions used as "possible reported locations".
>
> Consider the area you want to tile to be a flat surface. (A
> tessellation or tiling of a surface is a collection of
> figures that fills the original one with no overlaps or
> gaps). We are interested on tiling with approximately
> regular triangles of approximately the same size. The flat
> case this is easy: tile the plane in the standard way with
> regular (equilateral) triangles of the same size. In this way
> you get, with a suitable choice of coordinates, the grid
> points:
>
> (e x n, d x m) (n, and m range over the integers,
> with n+m = even
> e and d are numbers with
> d/e =((square root of three) /2),
> where "x" represents multiplication and
> "/" division).
>
> An example is given in the figure below, where the grid
> points are shown as asterisks "*". Thus, the points have
> coordinates (starting, say, at the bottom left with (0,0):
>
> (0,0), (0,2d), (0,4d), ...
> (e,d), (e,3d), (e,5d), ...
> (2e,0), (2e,2d), (2e,4d), ...
> (3e,d), (3e,3d), (3e,5d), ...
>
> Those points G ("*") are the "landmarks".
>
> * * *
>
>
>
>
>
>
>
> * * * *
>
>
>
>
>
>
>
> * * *
>
>
>
>
>
>
>
> * * * *
>
> Insert now the midpoints of the sides of the triangle. They
> are(shown in the figure below as dots, ".". Each of those
> points is equidistant to exactly two landmarks.
>
>
> . * . * . * .
>
>
>
> . . . . . .
>
>
>
> * . * . * . *
>
>
>
> . . . . . .
>
>
>
> . * . * . * .
>
>
>
> . . . . . .
>
>
>
> * . * . * . *
>
> Let us call S the set of all those points "+" or "*". Now
> consider, for each point in S, the set of points that are
> closest to a point x in S than to any other point of S.
> (This is the Voronoi diagram of S). This divides the region
> into hexagons as shown in the figure below. Each oint in
> the interior of a hexagon is closet to the center of the
> hexagon than to any other point "+" or "*".
>
> There are two types of hexagons: ones whose center is a
> landmark ("*") and ones whose center "." is equidistant to
> two landmarks. The points in the first type of hexagons are
> said to be "clearly closer to a landmark" (the center of the
> hexagon), while the points in the other type of hexagon are
> said to be "clearly between two landmarks".
>
> . | * | . | * | . | * | .
> + + + + + +
> / \ / \ / \ / \ / \ / \
> + + + + + + +
> | . | . | . | . | . | . |
> + + + + + + +
> \ / \ / \ / \ / \ / \ /
> + + + + + +
> * | . | * | . | * | . | *
> + + + + + +
> / \ / \ / \ / \ / \ / \
> + + + + + + +
> | . | . | . | . | . | . |
> + + + + + + +
> \ / \ / \ / \ / \ / \ /
> + + + + + +
> . | * | . | * | . | * | .
> + + + + + +
> / \ / \ / \ / \ / \ / \
> + + + + + + +
> | . | . | . | . | . | . |
> + + + + + + +
> \ / \ / \ / \ / \ / \ /
> + + + + + +
> * | . | * | . | * | . | *
>
> Now, given a point in the plane choose either the landmark
> clearly closest to it or one of the two closest each with
> probability 1/2.
>
> There is a variant of this for a square grid (but having up
> to 4 "closest landmarks"). It is relatively straight-forward
> to give the formulas for the landmarks associated with a
> point in the plane. I will expand on this on the following
> days.
>
> Ciao, Jorge
>
> On Thu, Aug 26, 2010 at 1:49 PM, Jorge Cuellar
> <jrcuellarj@googlemail.com> wrote:
>> Fine! I still have two questions:
>>
>> 1. Is it safe to assume that the algorithm that generates
>> the reported locations has a memory? It seems that the
>> algorithm checks somehow, (it doesn't matter how) if we are
>> still fine with the last reported location we still report
>> it, and if not, we report a new location. It would be good
>> if we could rely on this: you will be giving much more
>> information away if you did not remember what your last
>> reported location was.
>>
>> 2. Good random number generation is not easy. Shall we say
>> something about it? (Most hashes are fine: but hashes of
>> what?)
>>
>> -----------------------------------------------------------
>> I would prefer an algorithm that tells that I am in Berlin
>> (say a circle with center in the center of Berlin) than one
>> that gives each time a different circle with a center
>> randomly distributed around my real location. The averages
>> (plus some clustering algorithms) of those values would
>> provide a high precision on the places I visit in Berlin.
>>
>>> The problem is that the (real) location can vary
>>> slightly, or you might move a little. If it does vary
>>> slightly, even if this is well below the obfuscation
>>> distance, then the random number is completely altered.
>>> I haven't found a good way around this particular
>>> problem. Anything you do to stabilize the
>>> direction/length of the shift vector only leads to the
>>> ability to learn what the shift vector is.
>>
>> Thus I need one algorithm that chooses out of my location a
>> "landmark" that is a good approximation of my location (and
>> of all locations in a neighborhood).
>>
>> The idea is: if I am "clearly closer" to a landmark: choose it.
>> If not, choose the two closest landmarks. Then I am "clearly
>> between those two landmarks". I will choose anyone
>> either memoryless with prob 1/2 or with a bias toward my
>> previously reported location.
>>
>> Consider the following algorithm sketch:
>>
>> Divide the problem in two steps:
>>
>> 1. What are the locations I will be willing to have a
>> possible reported locations? The centers of those possible
>> reported locations are called "landmarks".
>>
>> 2. Given a real measurement, how to choose a landmark and
>> therefore a reported location?
>>
>> 1st step:
>>
>> Given (input) a reported uncertainty that one wants to
>> meet, first associate to them a grid G which is a set of
>> points ("landmarks" or "possible reported centers") and a
>> set A (atlas) of regions (say, circles, the "possible
>> reported locations"). Notice that the associations:
>>
>> uncertainty --> G = grid = Set of "landmarks" or
>> "possible reported locations"
>>
>> uncertainty --> A = atlas = Set of "possible reported
>> locations"
>>
>> are independent of any measured location. It is possible to
>> publish, for given value ranges of uncertainty, the list G
>> of landmarks (or, even better, the list of "blocks", see
>> below).
>>
>> An atlas is a set of regions that cover all possible places
>> (locations with uncertainty 0). I would restrict the
>> possible set of atlases by saying that the have to have a
>> certain degree of coverage (or degree of overlap). Although
>> not necessarily every point has to be in several regions of
>> the atlas, the overlaps between neighboring regions has to
>> be relatively large. This will be discussed later in more
>> detail.
>>
>> This can be done in many ways. For instance:
>>
>> a. the "possible reported centers" could be just the points
>> in a grid in some geographic coordinate system (say,
>> latitude and longitude) with a given degree of precision
>> (significant digits). For instance: represent the latitude
>> and longitude of a point as a number between 0 and 1 in
>> base two (more convenient than base 10). Consider first all
>> points with 1 decimal, than all with two decimal places,
>> than all with three, etc. For each point considered (this
>> is one "possible reported center") create the corresponding
>> "possible reported locations" (which can be, say, a
>> circle). You continue adding points until you get the degree
>> of coverage wanted.
>>
>> b. You can choose the grid points to be the centers of
>> "large" cities, towns, or other important landmarks in
>> maps. Again, we call those points in the maps "landmarks".
>> Assume you have an ordering of landmarks by importance
>> (say: size of the city in habitants). Given a certain
>> targeted uncertainty, create G and A as follows: add a
>> landmark as a "possible reported center" and the
>> corresponding "possible reported locations" as long as you
>> need it to cover the area in question with the expected
>> degree of coverage.
>>
>> For many applications I myself would choose in Germany the
>> reported uncertainty to be, say, 100 or 300 km and the
>> "possible reported centers" to be the center of some "large
>> cities".
>>
>> 2nd step:
>>
>> Now: given a measured location, a grid G and an atlas A,
>> the question is how to choose the region R in the atlas
>> that will be used as reported location.
>>
>> You can do it like this:
>> if the measured location is in only one region R, report
>> this as the location.
>>
>> Else, determine c1 and c2 the two closest landmarks, i.e,
>> "possible reported centers": at measured distances d1 and
>> d2. If d1 is "clearly smaller" than d2, choose c1, and
>> report the region centered at c1. "Clearly smaller" means
>> that the quotient d1/d2 is less than 0.8 (or other some
>> positive number, smaller than 1, to be chosen carefully a
>> priori). Similarly for d2/d1<1.3. If d1 and d2 are "about
>> the same", choose c1 with probability 1/2, c2 with
>> probability 1/2. In this case we say that the point is
>> "clearly between the two landmarks c1 and c2.
>>
>> The idea is that each point is either "close to one
>> landmark" or "between two landmarks". This partitions the
>> whole space into connected (even: convex) "blocks" (let us
>> call them so). Thus all points in one block are all clearly
>> closer to the same landmark or clearly between two given
>> landmarks.
>>
>> What do you think?
>> -- Jorge
>>
>> On Tue, Aug 24, 2010 at 7:38 AM, Thomson, Martin
>> <Martin.Thomson@andrew.com> wrote:
>>> Hi Jorge,
>>>
>>> My email was a little loose in its use of language, so I can
>>> understand how you might have trouble understanding this.
>>>
>>> More inline...
>>>
>>>> -----Original Message-----
>>>> From: Jorge Cuellar [mailto:jrcuellarj@googlemail.com]
>>>> Sent: Tuesday, 24 August 2010 1:14 AM
>>>> To: Thomson, Martin; rjsparks@nostrum.com; Hannes.Tschofenig@gmx.net
>>>> ;
>>>> geopriv@ietf.org
>>>> Subject: Re: [Geopriv] location obscuring
>>>>
>>>> Hi Martin, Hannes, Robert,
>>>>
>>>> I would like to understand you discussions on "location
>>>> obscuring". First, let me ask you a couple of things :
>>>>
>>>> 1. What is your vocabulary / notation?? Assume there is no
>>>> uncertainity, first. You have one "real location" (?) and two
>>>> circles, a "trigger circle" and a "reported circle" (obfuscated
>>>> circle? obfuscated location?). The centres of those two
>>>> circles ("trigger centre / reported centre" ??) are, I guess,
>>>> different (if not, I see some problems).
>>>
>>> There are three locations:
>>>
>>> - the original location ("real" might be a little too strong here)
>>> - the reported location
>>> - the trigger location
>>>
>>> The first two probably have uncertainty associated with them
>>> (imagine a circle). The trigger location doesn't have uncertainty
>>> in the same way, but can also be described with a corresponding
>>> circle enclosing it. Trigger and reported are different (though
>>> there is a slim chance that they are the same...).
>>>
>>>> The vectors:
>>>>
>>>> trigger centre - real location and
>>>> reported centre - real location
>>>>
>>>> let us call them "shift in reporting" / "shift for trigger" have
>>>> lenght (or absolute value or norm) "distance to trigger centre"
>>>> / "distance to reported centre". The direction (angle to
>>>> some "x-axis" of those vectors are the "Direction of the shift
>>>> in reporting" and "Direction of the shift for trigger".
>>>
>>> That's a good description.
>>>
>>>>
>>>> I read the formula:
>>>>
>>>> Direction = 2 * pi * random
>>>>
>>>> as an angle (random is a random number on [0,1]). And
>>>>
>>>> direction * distance
>>>>
>>>> as a vector (in radians, of lenght distance at an angle direction).
>>>
>>> Correct; I was using shorthand - to the detriment of clarity :(
>>>
>>>> 2. Have you thought about the following problem? Suppose each day
>>>> I commute from home to work. They are sufficiently apart so that
>>>> the 2 "reported circles" have to be disjoint. Suppose I turn off
>>>> my device during the trip, but torn it on as soon as I changed my
>>>> location. All algorithms you have been discussing, as far as I
>>>> see, provide every evening, when I come back home, a
>>>> new "reported circle" centred around home plus a vector of fixed
>>>> lenght with random direection. True?? Thus if anyone sees my
>>>> provided locations all evenings he will know rather fast where
>>>> exacly I live (by averaging the random shifts).
>>>
>>> That's one that didn't factor into this, though perhaps it should
>>> have. Though this relies on the assumption that each night you
>>> are going to the same place, that's a pretty easy conclusion to
>>> come to.
>>>
>>> One of the original ideas was to seed a pseudorandom number with
>>> the (real) location. A hash function would be a good substitute.
>>> This would produce the same shift vector for the same location.
>>>
>>> The problem is that the (real) location can vary slightly, or you
>>> might move a little. If it does vary slightly, even if this is
>>> well below the obfuscation distance, then the random number is
>>> completely altered. I haven't found a good way around this
>>> particular problem. Anything you do to stabilize the direction/
>>> length of the shift vector only leads to the ability to learn what
>>> the shift vector is.
>>>
>>>> 3. In the formula:
>>>>
>>>> Distance = sqrt(random(0, max(0, distance - existing
>>>> uncertainty)))
>>>>
>>>> you have twice distance (one with capitals): is that intended?
>>>> Later you use only the non-capitalized one (?)
>>>
>>> That's a mistake, the second is the obfuscation distance, the
>>> first is the resulting shift.
>>>
>>>> 4. I did not understand this:
>>>> This is the same as below, except: (exept what??? and notation
>>>> below?)
>>>>
>>>> reallyFuzz: function(cu) {
>>>> var wobble = Math.max(0, this.distance - cu.uncertainty);
>>>> var centre = this.randomize(cu.centroid, wobble);
>>>> var unc = Math.max(cu.uncertainty, this.distance);
>>>> this.fuzzed = new GeoShape.Circle(centre, unc);
>>>> this.triggerPoint = this.randomize(centre, this.distance/
>>>> 2);
>>>> }
>>>
>>> This is a replacement for the original Javascript code I used to
>>> describe the algorithm. The complete (original) is here:
>>>
>>> <http://www.ietf.org/mail-archive/web/geopriv/current/
>>> msg08630.html>
>>
> _______________________________________________
> Geopriv mailing list
> Geopriv@ietf.org
> https://www.ietf.org/mailman/listinfo/geopriv

_______________________________________________
Geopriv mailing list
Geopriv@ietf.org
https://www.ietf.org/mailman/listinfo/geopriv
Received on Fri, 27 Aug 2010 13:09:44 -0400

This archive was generated by hypermail 2.1.8 : Fri Aug 27 2010 - 13:10:02 EDT