From: Jorge Cuellar ^lt;*jrcuellarj@googlemail.com*>

Date: Thu Aug 26 2010 - 07:49:10 EDT

Date: Thu Aug 26 2010 - 07:49:10 EDT

Fine! I still have two questions:

1. Is it safe to assume that the algorithm that generates

the reported locations has a memory? It seems that the

algorithm checks somehow, (it doesn't matter how) if we are

still fine with the last reported location we still report

it, and if not, we report a new location. It would be good

if we could rely on this: you will be giving much more

information away if you did not remember what your last

reported location was.

2. Good random number generation is not easy. Shall we say

something about it? (Most hashes are fine: but hashes of

what?)

-----------------------------------------------------------

I would prefer an algorithm that tells that I am in Berlin

(say a circle with center in the center of Berlin) than one

that gives each time a different circle with a center

randomly distributed around my real location. The averages

(plus some clustering algorithms) of those values would

provide a high precision on the places I visit in Berlin.

*> The problem is that the (real) location can vary
*

*> slightly, or you might move a little. If it does vary
*

*> slightly, even if this is well below the obfuscation
*

*> distance, then the random number is completely altered.
*

*> I haven't found a good way around this particular
*

*> problem. Anything you do to stabilize the
*

*> direction/length of the shift vector only leads to the
*

*> ability to learn what the shift vector is.
*

Thus I need one algorithm that chooses out of my location a

"landmark" that is a good approximation of my location (and

of all locations in a neighborhood).

The idea is: if I am "clearly closer" to a landmark: choose it.

If not, choose the two closest landmarks. Then I am "clearly

between those two landmarks". I will choose anyone

either memoryless with prob 1/2 or with a bias toward my

previously reported location.

Consider the following algorithm sketch:

Divide the problem in two steps:

1. What are the locations I will be willing to have a

possible reported locations? The centers of those possible

reported locations are called "landmarks".

2. Given a real measurement, how to choose a landmark and

therefore a reported location?

1st step:

Given (input) a reported uncertainty that one wants to

meet, first associate to them a grid G which is a set of

points ("landmarks" or "possible reported centers") and a

set A (atlas) of regions (say, circles, the "possible

reported locations"). Notice that the associations:

uncertainty --> G = grid = Set of "landmarks" or

"possible reported locations"

uncertainty --> A = atlas = Set of "possible reported

locations"

are independent of any measured location. It is possible to

publish, for given value ranges of uncertainty, the list G

of landmarks (or, even better, the list of "blocks", see

below).

An atlas is a set of regions that cover all possible places

(locations with uncertainty 0). I would restrict the

possible set of atlases by saying that the have to have a

certain degree of coverage (or degree of overlap). Although

not necessarily every point has to be in several regions of

the atlas, the overlaps between neighboring regions has to

be relatively large. This will be discussed later in more

detail.

This can be done in many ways. For instance:

a. the "possible reported centers" could be just the points

in a grid in some geographic coordinate system (say,

latitude and longitude) with a given degree of precision

(significant digits). For instance: represent the latitude

and longitude of a point as a number between 0 and 1 in

base two (more convenient than base 10). Consider first all

points with 1 decimal, than all with two decimal places,

than all with three, etc. For each point considered (this

is one "possible reported center") create the corresponding

"possible reported locations" (which can be, say, a

circle). You continue adding points until you get the degree

of coverage wanted.

b. You can choose the grid points to be the centers of

"large" cities, towns, or other important landmarks in

maps. Again, we call those points in the maps "landmarks".

Assume you have an ordering of landmarks by importance

(say: size of the city in habitants). Given a certain

targeted uncertainty, create G and A as follows: add a

landmark as a "possible reported center" and the

corresponding "possible reported locations" as long as you

need it to cover the area in question with the expected

degree of coverage.

For many applications I myself would choose in Germany the

reported uncertainty to be, say, 100 or 300 km and the

"possible reported centers" to be the center of some "large

cities".

2nd step:

Now: given a measured location, a grid G and an atlas A,

the question is how to choose the region R in the atlas

that will be used as reported location.

You can do it like this:

if the measured location is in only one region R, report

this as the location.

Else, determine c1 and c2 the two closest landmarks, i.e,

"possible reported centers": at measured distances d1 and

d2. If d1 is "clearly smaller" than d2, choose c1, and

report the region centered at c1. "Clearly smaller" means

that the quotient d1/d2 is less than 0.8 (or other some

positive number, smaller than 1, to be chosen carefully a

priori). Similarly for d2/d1<1.3. If d1 and d2 are "about

the same", choose c1 with probability 1/2, c2 with

probability 1/2. In this case we say that the point is

"clearly between the two landmarks c1 and c2.

The idea is that each point is either "close to one

landmark" or "between two landmarks". This partitions the

whole space into connected (even: convex) "blocks" (let us

call them so). Thus all points in one block are all clearly

closer to the same landmark or clearly between two given

landmarks.

What do you think?

-- Jorge

On Tue, Aug 24, 2010 at 7:38 AM, Thomson, Martin

<Martin.Thomson@andrew.com> wrote:

*> Hi Jorge,
*

*>
*

*> My email was a little loose in its use of language, so I can understand how you might have trouble understanding this.
*

*>
*

*> More inline...
*

*>
*

*>> -----Original Message-----
*

*>> From: Jorge Cuellar [mailto:jrcuellarj@googlemail.com]
*

*>> Sent: Tuesday, 24 August 2010 1:14 AM
*

*>> To: Thomson, Martin; rjsparks@nostrum.com; Hannes.Tschofenig@gmx.net;
*

*>> geopriv@ietf.org
*

*>> Subject: Re: [Geopriv] location obscuring
*

*>>
*

*>> Hi Martin, Hannes, Robert,
*

*>>
*

*>> I would like to understand you discussions on "location
*

*>> obscuring". First, let me ask you a couple of things :
*

*>>
*

*>> 1. What is your vocabulary / notation?? Assume there is no
*

*>> uncertainity, first. You have one "real location" (?) and two
*

*>> circles, a "trigger circle" and a "reported circle" (obfuscated
*

*>> circle? obfuscated location?). The centres of those two
*

*>> circles ("trigger centre / reported centre" ??) are, I guess,
*

*>> different (if not, I see some problems).
*

*>
*

*> There are three locations:
*

*>
*

*> - the original location ("real" might be a little too strong here)
*

*> - the reported location
*

*> - the trigger location
*

*>
*

*> The first two probably have uncertainty associated with them (imagine a circle). The trigger location doesn't have uncertainty in the same way, but can also be described with a corresponding circle enclosing it. Trigger and reported are different (though there is a slim chance that they are the same...).
*

*>
*

*>> The vectors:
*

*>>
*

*>> trigger centre - real location and
*

*>> reported centre - real location
*

*>>
*

*>> let us call them "shift in reporting" / "shift for trigger" have
*

*>> lenght (or absolute value or norm) "distance to trigger centre"
*

*>> / "distance to reported centre". The direction (angle to
*

*>> some "x-axis" of those vectors are the "Direction of the shift
*

*>> in reporting" and "Direction of the shift for trigger".
*

*>
*

*> That's a good description.
*

*>
*

*>>
*

*>> I read the formula:
*

*>>
*

*>> Direction = 2 * pi * random
*

*>>
*

*>> as an angle (random is a random number on [0,1]). And
*

*>>
*

*>> direction * distance
*

*>>
*

*>> as a vector (in radians, of lenght distance at an angle direction).
*

*>
*

*> Correct; I was using shorthand - to the detriment of clarity :(
*

*>
*

*>> 2. Have you thought about the following problem? Suppose each day
*

*>> I commute from home to work. They are sufficiently apart so that
*

*>> the 2 "reported circles" have to be disjoint. Suppose I turn off
*

*>> my device during the trip, but torn it on as soon as I changed my
*

*>> location. All algorithms you have been discussing, as far as I
*

*>> see, provide every evening, when I come back home, a
*

*>> new "reported circle" centred around home plus a vector of fixed
*

*>> lenght with random direection. True?? Thus if anyone sees my
*

*>> provided locations all evenings he will know rather fast where
*

*>> exacly I live (by averaging the random shifts).
*

*>
*

*> That's one that didn't factor into this, though perhaps it should have. Though this relies on the assumption that each night you are going to the same place, that's a pretty easy conclusion to come to.
*

*>
*

*> One of the original ideas was to seed a pseudorandom number with the (real) location. A hash function would be a good substitute. This would produce the same shift vector for the same location.
*

*>
*

*> The problem is that the (real) location can vary slightly, or you might move a little. If it does vary slightly, even if this is well below the obfuscation distance, then the random number is completely altered. I haven't found a good way around this particular problem. Anything you do to stabilize the direction/length of the shift vector only leads to the ability to learn what the shift vector is.
*

*>
*

*>> 3. In the formula:
*

*>>
*

*>> Distance = sqrt(random(0, max(0, distance - existing uncertainty)))
*

*>>
*

*>> you have twice distance (one with capitals): is that intended?
*

*>> Later you use only the non-capitalized one (?)
*

*>
*

*> That's a mistake, the second is the obfuscation distance, the first is the resulting shift.
*

*>
*

*>> 4. I did not understand this:
*

*>> This is the same as below, except: (exept what??? and notation below?)
*

*>>
*

*>> reallyFuzz: function(cu) {
*

*>> var wobble = Math.max(0, this.distance - cu.uncertainty);
*

*>> var centre = this.randomize(cu.centroid, wobble);
*

*>> var unc = Math.max(cu.uncertainty, this.distance);
*

*>> this.fuzzed = new GeoShape.Circle(centre, unc);
*

*>> this.triggerPoint = this.randomize(centre, this.distance/2);
*

*>> }
*

*>
*

*> This is a replacement for the original Javascript code I used to describe the algorithm. The complete (original) is here:
*

*>
*

*> <http://www.ietf.org/mail-archive/web/geopriv/current/msg08630.html>
*

_______________________________________________

Geopriv mailing list

Geopriv@ietf.org

https://www.ietf.org/mailman/listinfo/geopriv

Received on Thu, 26 Aug 2010 13:49:10 +0200

*
This archive was generated by hypermail 2.1.8
: Thu Aug 26 2010 - 07:49:28 EDT
*