2011.06.30 - ETL Loose Ends
This set of remaining ETL issues has ramifications that go well beyond the ETL itself.
The Right of Way Address issue is a good example (details below).
With that in mind I think it is vitally important that we focus on this question...
what is the bare minimum set of features and data that we need to make a viable 1.1 release?
service addresses
Description
There are a number of AVS addresses that use a non-existent block lot.
During the ETL to EAS these are generating exceptions of the type "no matching block - lot".
We are waiting for DBI to confirm how we can identify these addresses (e.g. lot = '000'?).
We expect that most of these addresses will have a valid street and street number.
We are calling these addresses service addresses.
Conceptually they exist in the "right of way", but we may not model them that way in geographic space.
The EAS application is designed to prevent addresses from being created outside a parcel boundary.
We were not planning on supporting this type of address until a later release.
Other agencies such as DPW want this (or a similar) feature as well.
Option 1 (increase scope)
It appears that there will be 2 chunks of work here.
First we will have to introduce the notion of a "service parcel".
The service parcel would be a citywide parcel with a unique block-lot (e.g. 0000-000) that would support service addresses at arbitrary locations - including the right of way.
The web application behavior may be imperfect in some edge cases (user adds an address in the middle of the bay) but this may not require any changes if the application behavior is reasonable.
I think this work will take less than a day.
The second challenge will be to write some additional ETL code - possibly non-trivial.
I think this work may take as many as 3 days.
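A minimal sketch of how the extra ETL code might route these rows, not a definitive implementation. It assumes (pending DBI confirmation) that lot = '000' marks a service address and that the service parcel uses block-lot 0000-000. The `AvsAddress` fields, `SERVICE_PARCEL`, and `resolve_parcel` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

# Hypothetical: block-lot of the citywide service parcel (example value from these notes).
SERVICE_PARCEL = ("0000", "000")


@dataclass
class AvsAddress:
    street_number: int
    street_name: str
    block: str
    lot: str


def is_service_address(addr: AvsAddress) -> bool:
    # Assumption awaiting DBI confirmation: lot '000' identifies a service address.
    return addr.lot == "000"


def resolve_parcel(addr: AvsAddress, known_parcels: set) -> tuple:
    """Return the (block, lot) the address should attach to.

    Known parcels are used as-is, service addresses fall back to the
    service parcel, and anything else raises the existing exception.
    """
    key = (addr.block, addr.lot)
    if key in known_parcels:
        return key
    if is_service_address(addr):
        return SERVICE_PARCEL
    raise LookupError("no matching block - lot")
```

With this shape, rows that today fail with "no matching block - lot" would only be excepted when they are not recognizable as service addresses.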
Option 2 (maintain scope)
Defer loading these addresses from AVS until the next release.
I assume we would have to focus on this issue in the very next release.
EAS missing some streets
When EAS users find that streets are missing from EAS or are incorrect,
they shall notify Marivic at DPW. Marivic will make the corrections,
which should appear in EAS the following day.
no such street name - number is non-zero (SITUS TO BE ASSIGNED, UNKNOWN)
We can supply the correct street and street type.
Val, I assume that we should simply build the SQL for this...something like:
- update addresses set street_name = x, street_type = y where id = z;
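One way the ETL could emit the update statement above is sketched here. The table and column names are taken from the example statement and may not match the real AVS schema; `street_fix_sql` is a hypothetical helper name.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def street_fix_sql(address_id: int, street_name: str, street_type: str) -> str:
    """Build one update statement for a corrected street name and type."""
    return (
        "update addresses set street_name = %s, street_type = %s where id = %d;"
        % (sql_literal(street_name), sql_literal(street_type), address_id)
    )


if __name__ == "__main__":
    print(street_fix_sql(42, "O'FARRELL", "ST"))
```

Quoting matters here because street names such as O'FARRELL contain an apostrophe.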
no such street name - number is zero (SITUS TO BE ASSIGNED, UNKNOWN)
Option 1
- create a dummy street record for the UNKNOWN street
- duplicates are not allowed, so we would have to create 0..N UNKNOWN
- we would provide update statements such as
- update addresses set number = n where id = x;
Option 2
We can programmatically select a good approximation.
Details
- query to get the street nearest the parcel centroid
- query to find the point on the street that is nearest the parcel centroid
- use that point and interpolation to choose the best address number
- no duplicates!
- spatial query to determine right/left side (then even/odd number)
- we would provide update statements such as
- update addresses set number = n, street = s, street_type = t where id = x;
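The steps listed above could be sketched roughly as follows. This is plain Python standing in for the spatial queries, under several assumptions: the nearest street is already known and is treated as a single straight segment from `a` to `b`, its address range runs from `low` at `a` to `high` at `b`, and odd numbers fall on the left side (the `left_is_odd` flag). All function and parameter names are hypothetical.

```python
def nearest_point_fraction(a, b, p):
    """Fraction (0..1) along segment a->b of the point nearest to p."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return 0.0  # degenerate segment: a and b coincide
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    return max(0.0, min(1.0, t))  # clamp to the segment ends


def side_of_street(a, b, p):
    """'left' or 'right' of the direction a->b, from the sign of the 2D cross product."""
    ax, ay = a
    bx, by = b
    px, py = p
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    return "left" if cross > 0 else "right"


def approximate_number(a, b, centroid, low, high, used, left_is_odd=True):
    """Interpolate a street number for a parcel centroid, avoiding duplicates."""
    t = nearest_point_fraction(a, b, centroid)
    number = int(round(low + t * (high - low)))
    # Pick the parity that matches the side of the street.
    want_odd = (side_of_street(a, b, centroid) == "left") == left_is_odd
    if (number % 2 == 1) != want_odd:
        number += 1
    # No duplicates: step by 2 to keep the parity until a free number is found.
    # Note: this can run past `high`; a real version would need to handle that.
    while number in used:
        number += 2
    return number
```

The `used` set is whatever numbers already exist on that street, which is how the "no duplicates" requirement is enforced in this sketch.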
Bear in mind that this solution means at least 2 iterations of the ETL process.
First time through these addresses get excepted - and we generate the SQL.
The SQL is then applied to AVS.
On iteration 2 the rows are imported.
I like option 2 - it seems like a cleaner solution.
Amount of work for either option seems like about 2-3 days.
invalid street suffix
We agreed that in specific limited circumstances we would update street suffixes in AVS.
- e.g. Stanyan Blvd vs Stanyan St (nearest one is Blvd but AVS says St - we should choose BLVD)
- e.g. Broadway vs Broadway St (there is no Broadway St - only alternative is Broadway)
Right now the ETL generates SQL, but that SQL is not directly usable in AVS because AVS stores the street suffix as char(2) while EAS uses a longer field (char(3) or char(4)?).
Val:
Is there something else we can do here instead of this SQL?
For example, I was thinking of adding 2 columns
- update_field ('street_type')
- update_value ('AVENUE')
That way you could construct your own SQL.
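To illustrate the two-column idea, here is a rough sketch of how an exception row carrying update_field and update_value could be turned into AVS-side SQL. The `AVS_STREET_TYPE` codes are placeholders and not real AVS values, and the `addresses` table name, the `id` key, and `build_update` are assumptions made for illustration only.

```python
# Placeholder mapping from EAS street types to AVS char(2) codes.
# These codes are invented for the example; the real ones must come from AVS.
AVS_STREET_TYPE = {"AVENUE": "AV", "BOULEVARD": "BL", "STREET": "ST"}

# Only fields on this list may be interpolated into the statement as a column name.
ALLOWED_FIELDS = {"street_type"}


def build_update(row, value_map=None):
    """Build an update statement from an exception row.

    row is a dict with 'id', 'update_field' and 'update_value' keys.
    value_map optionally translates the EAS value into the AVS encoding.
    """
    field = row["update_field"]
    if field not in ALLOWED_FIELDS:
        raise ValueError("unsupported update_field: %s" % field)
    value = row["update_value"]
    if value_map is not None:
        value = value_map.get(value, value)
    quoted = "'" + value.replace("'", "''") + "'"
    return "update addresses set %s = %s where id = %d;" % (field, quoted, row["id"])
```

The point of the design is that the ETL only reports what should change, and the translation into the AVS encoding stays on the AVS side.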
link to recorded maps using APN
Say that we are using the EAS user interface to examine the address at APN 1234001.
This APN resolves to 1301 Page St.
Clicking on "Details" will result in the address details panel showing on the map.
From that panel you can click on an APN and see this page:
According to Vivian, this is not exactly what we want to show at this point.
Instead we want to show the recorded map(s).
We think we know what to show and we would like Vivian to confirm.
So generally we think she wants to use this site:
We can hack the URL a bit and provide this:
which returns nothing. We can also just enter the block and show that
From there if you follow the links I think you can (eventually) see the recorded map(s) - for example
On that last link you may need to tell your browser how to handle a tiff image file.
This is not terribly elegant but it might suffice.
provide additional info for certain exceptions
Ideally we would like to fix any bad data we come across.
The exceptions that say:
"insert rejected - address_x_parcels - address geometry is not contained by the specified parcel"
are a bit tricky - but can we do something with these?
Case 1: 0 Alvord (7 exceptions)
What should happen here is that we create 7 base addresses (0 Alvord, 2 Alvord ... 12 Alvord).
We would have to do slightly fancy spatial processing but it is possible.
The SQL generated by the ETL would have to be applied in AVS, and the ETL process run a second time to pick up the cleaned-up data.
Case 2:
Need more info here
address_kind
- consider no action
- consider not bringing these data into EAS
please see the issue http://code.google.com/p/eas/issues/detail?id=331
address_type
please see the issue http://code.google.com/p/eas/issues/detail?id=332
structure_number
please see the issue http://code.google.com/p/eas/issues/detail?id=335