[
https://issues.apache.org/jira/browse/PIG-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000968#comment-14000968
]
Philip (flip) Kromer commented on PIG-3877:
-------------------------------------------
* This makes separate HTTP calls for the latitude, then the longitude. Better
to have one method that returns a tuple prepared from the fully-parsed response
and let the caller project what they want.
* What happens on a response that fails to geocode, or that for any other
reason doesn't have a latLng element? The JSONObject latLng = (JSONObject)
((JSONObject)locations.get(0)).get("latLng"); geolongitude = (String)
latLng.get("lng"); sequence feels like a recipe for an NPE.
* Is the Intuit backend ready for people who might use this in production? Or
even for Apache's and the world's automated build systems to hit it without
coming across as abusive?
* I worry about having Pig make a network call on every record. There's no
facility for throttling, backoff, or HTTP keep-alive.
* Even with those, the only way I can imagine to make this workable at
production scale using an over-the-network geocoder would be to deploy an
instance on each machine. Pete Warden's [Data Science
Toolkit|http://petewarden.com/2013/10/06/geocode-the-world-with-the-new-data-science-toolkit/]
has a [Standalone
Geocoder|http://www.datasciencetoolkit.org/developerdocs#googlestylegeocoder];
this should target that and refer to it (or an acceptable alternative) in the docs.
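
To make the first two points concrete, here is a minimal sketch of what I mean, not a drop-in replacement for the patch. The class name LatLngExtractor and its methods are hypothetical. It assumes the patch parses with json-simple, whose JSONObject and JSONArray implement java.util.Map and java.util.List, so the sketch uses only those interfaces and runs without any extra jars. It takes the already-parsed locations list, reads both coordinates in one pass, and returns an empty Optional instead of throwing when any element is missing.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper (not in the patch): one parse, both coordinates, no NPE. */
final class LatLngExtractor {

    /**
     * Returns {lat, lng} from the first location, or Optional.empty() when the
     * list is null or empty, or when latLng, lat, or lng is missing or unusable.
     */
    static Optional<double[]> extract(List<?> locations) {
        if (locations == null || locations.isEmpty()) {
            return Optional.empty();
        }
        Object first = locations.get(0);
        if (!(first instanceof Map)) {
            return Optional.empty();
        }
        Object latLng = ((Map<?, ?>) first).get("latLng");
        if (!(latLng instanceof Map)) {
            return Optional.empty();
        }
        Double lat = toDouble(((Map<?, ?>) latLng).get("lat"));
        Double lng = toDouble(((Map<?, ?>) latLng).get("lng"));
        if (lat == null || lng == null) {
            return Optional.empty();
        }
        return Optional.of(new double[] {lat, lng});
    }

    /** Accepts numeric or string values; returns null for anything else. */
    private static Double toDouble(Object value) {
        if (value instanceof Number) {
            return ((Number) value).doubleValue();
        }
        if (value instanceof String) {
            try {
                return Double.valueOf(((String) value).trim());
            } catch (NumberFormatException e) {
                return null;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Sample data invented for illustration only.
        Map<String, Object> latLng = new HashMap<>();
        latLng.put("lat", 37.42);
        latLng.put("lng", "-122.08");
        Map<String, Object> location = new HashMap<>();
        location.put("latLng", latLng);

        Optional<double[]> hit = extract(Collections.singletonList(location));
        System.out.println("geocoded: " + hit.isPresent());

        Optional<double[]> miss = extract(Collections.emptyList());
        System.out.println("empty response geocoded: " + miss.isPresent());
    }
}
```

In the UDF itself, the empty case would presumably map to returning null for the record and the present case to a two-field tuple, so that the caller projects the latitude or the longitude from a single HTTP call.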
> Getting Geo Latitude/Longitude from Address Lines
> -------------------------------------------------
>
> Key: PIG-3877
> URL: https://issues.apache.org/jira/browse/PIG-3877
> Project: Pig
> Issue Type: Improvement
> Components: piggybank
> Affects Versions: 0.10.1
> Reporter: Rekha Joshi
> Assignee: Rekha Joshi
> Labels: patch, piggybank
> Fix For: 0.10.1
>
> Attachments: PIG-3877.1.patch
>
>
> In many data-mining use cases, latitude and longitude must be derived
> from address lines alone. The IP fields are missing.
> The attached UDFs get the geo latitude/longitude from address lines.
--
This message was sent by Atlassian JIRA
(v6.2#6252)