DBI Intgration Issues
I want to review the issues we have with the integration so far.
All of this is my best guess - I invite everyone to comment.
Suffixed Street Numbers
This is when the street number is followed by an additional element such as the "1/2" in "100 1/2 Main St".
But it may well also be the "A" in "100 A Main".
Data
EAS uses char(10) for this field ( base_address_suffix character varying(10))
The FGDC standard simply specifies character; no length is mentioned but the examples are all 1-3 chars.
I think I found 35 addresses in our data with a "1/2" that belongs in the address number suffix.
It appears that they have been mis-parsed into the unit_num - I will look at this with Adam today.
DBI may want to classify the "1/2" addresses as unofficial - DBI will have to comment.
Issues
AVS stores the street number suffix in a char(1) - therefore the change notification messaging has a bug.
This is potentially a difficult issue because a number legacy systems and reports rely on this char(1) field.
http://sfgovdt.jira.com/browse/MAD-122
Unit
Let's go right to an example.
"100 Main St. Apartment A"
In this example "Apartment" is the unit type and "A" is the Unit Number.
Other examples include:
"100 Main St. Suite 100"
"100 Main St. Basement"
"100 Main St. #3"
FGDC draft 2:
unit_type (char - no length specified)
unit_identifier (char)
FGDC final draft:
subaddress_type (char - no length specified)
subaddress_identifier (char - no length specified)
This draft gets more complicated where a single address might include "Bldg 7, Floor 2, Suite 100", but I think this is off topic.
AVS:
UNIT(numeric)
UNIT_SFX char(10)
EAS:
unit_num_prefix character varying(5),
unit_num character varying(20) DEFAULT (-1),
unit_num_suffix character varying(10),
EAS does not have any values in these columns.
unit_num_suffix
unit_num_prefix
We should debate dropping these columns.
Issues
The problem we will run into is, again, in the change notification, where AVS and EAS store the data differently as shown above.
At the moment, I suspect AVS will stack trace when it tries to take an EAS unit_num (char) and put it into its unit number (numeric) field.
http://sfgovdt.jira.com/browse/MAD-171
See the issue and the attachment for all the details.
In summary EAS has unit_nums such as:
- 1
- A1
- 1A
- A
Summary
We'll have to look at each integration issue as it comes up and study the legacy system(s) carefully before we even propose solutions.
I guess that part is obvious.
As a rule of thumb with EAS, should we decide to make changes of any sort, I will argue that we should move closer to the FGDC final draft standard.
What does this mean in concrete terms?
Take http://sfgovdt.jira.com/browse/MAD-171 as an example.
Given the existing EAS data model, an anal retentive (such as myself) might want to split out the character prefixes and suffixes into the existing columns:
unit_num_suffix
unit_num_prefix
But this moves us further away from the FGDC standard rather than closer to it.
I would argue that instead we drop those columns - making our model more consistent with the standard.
In this case, this leaves us with a bit more integration work with AVS - we'll have to add some parsing and some if statements to the consumer of the change notification messages.
But in the long run, following a published standard should be more sustainable, especially since we anticipate supporting multiple agencies with various use cases.
And for those of you who have not yet had enough punishment the FGDC standards are available at these URLs
second draft:
http://www.fgdc.gov/standards/projects/FGDC-standards-projects/street-address/05-11.2ndDraft.CompleteDoc.pdf
final draft
http://www.fgdc.gov/standards/projects/FGDC-standards-projects/street-address/AddressStandardFINAL.zip