Hi Sarthak,

Thank you for your note, but I already wrote:

> Don't wait for anybody with proposal. The new GSoC site is right place to discuss proposals.

So I expected to see your proposal on that site and to comment on it there if needed. As a reminder, the site is https://summerofcode.withgoogle.com/

Best regards,
    Dmitry

25.03.2016 10:17, sarthak agarwal wrote:
The deadline is today.

Sarthak

On Thu, Mar 24, 2016 at 1:52 AM, sarthak agarwal <sarthak0...@gmail.com> wrote:

    Hello Dmitry,

    I fixed the bug (I guess).
    Now, coming to my GSoC proposal: I was thinking of working on
    project #4, *Auto-detection of EPSG codes from incomplete WKT.*

    As I understand the project, we need to predict the EPSG code of
    a file on the basis of the attributes that are available in that
    file.

    The attributes can be extracted from the file; for this I read
    the OSR tutorial
    <http://www.gdal.org/osr_tutorial.html#querying_coordinate_system>.
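As a rough illustration of what "attributes" could mean here, the sketch below pulls a few node names out of a WKT string. It is a regex stand-in written only with the Python standard library, not the OSR API; the linked tutorial queries the same kind of values through calls such as `GetAttrValue`. The WKT string and the `extract_attributes` helper are hypothetical and exist only for this example.

```python
import re

# Hypothetical, deliberately incomplete WKT (no AUTHORITY node), for illustration only.
WKT = (
    'PROJCS["WGS 84 / UTM zone 43N",'
    'GEOGCS["WGS 84",DATUM["WGS_1984",'
    'SPHEROID["WGS 84",6378137,298.257223563]],'
    'PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433]],'
    'PROJECTION["Transverse_Mercator"],'
    'PARAMETER["central_meridian",75],'
    'UNIT["metre",1]]'
)

def extract_attributes(wkt):
    """Return {node: first quoted name} for a few WKT nodes.

    Regex stand-in for a real WKT parser; nodes that are absent are skipped.
    """
    attrs = {}
    for node in ("PROJCS", "GEOGCS", "DATUM", "SPHEROID", "PROJECTION"):
        match = re.search(node + r'\["([^"]*)"', wkt)
        if match:
            attrs[node] = match.group(1)
    return attrs

attributes = extract_attributes(WKT)
```

The resulting dictionary is the kind of feature set that the methods discussed in this message would take as input.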

    To solve this problem I thought of a lot of methods, but I think
    the best way to solve it will be to use machine learning.

    ML would handle this problem as follows:

     1. We need to find the EPSG code for a file (testing data).
     2. We have a file with some attributes (projection, datum, etc.).
     3. We need to guess the most suitable class (EPSG code) for that file.
     4. Also, we have many files for which we know the attributes and
        the corresponding class (training data).
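The four points above could be represented in code roughly as follows. This is a minimal sketch: the `Sample` class and the attribute names are assumptions made for illustration, and the two training rows are toy data rather than a real training set.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class Sample:
    """One file: its extracted attributes and, if known, its EPSG code."""
    attributes: Dict[str, str]
    epsg: Optional[int] = None  # None means "unknown, to be predicted"

# Training data: attributes with a known class (EPSG code).
training_data = [
    Sample({"DATUM": "WGS_1984", "PROJECTION": "Transverse_Mercator"}, epsg=32643),
    Sample({"DATUM": "WGS_1984"}, epsg=4326),
]

# Testing data: attributes only; the class is what a model has to guess.
query = Sample({"DATUM": "WGS_1984", "PROJECTION": "Transverse_Mercator"})
```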

    This problem is now translated into an ML problem, which can be
    solved using the following models:

    1. Bayesian statistics
    <https://en.wikipedia.org/wiki/Posterior_probability>

        where,
        posterior probability = probability that this file has EPSG
        code 'a'.
        prior probability = probability of occurrence of EPSG code 'a'.

        likelihood = how often we saw such attributes
        when the EPSG code is 'a'.


    2. Or we can use a simple kNN, where k is the number of possible
    EPSG codes and the dimension of the feature vector is the number of
    possible attributes (we need to find a valid and promising
    weight function).


    3. We can use a multi-class SVM.

    4. Any other suggestions from the community regarding the possible
    choice of algorithm.
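To make the first option in the list concrete, here is a minimal sketch of a naive Bayes classifier over categorical attributes. Naive Bayes is only one way to apply the prior/likelihood/posterior formulation; it additionally assumes the attributes are independent given the EPSG code, which the message does not state. The function names, the Laplace smoothing, and the toy training rows are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """Count classes and attribute values. samples: list of (attrs dict, epsg)."""
    class_counts = Counter()
    value_counts = defaultdict(Counter)  # (epsg, attribute name) -> value counts
    vocab = defaultdict(set)             # attribute name -> values seen anywhere
    for attrs, epsg in samples:
        class_counts[epsg] += 1
        for name, value in attrs.items():
            value_counts[(epsg, name)][value] += 1
            vocab[name].add(value)
    return class_counts, value_counts, vocab

def log_posteriors(attrs, model):
    """Unnormalised log posterior per EPSG code: log prior + sum of log likelihoods."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for epsg, count in class_counts.items():
        score = math.log(count / total)  # prior: how common this EPSG code is
        for name, value in attrs.items():
            seen = value_counts[(epsg, name)]
            # Laplace smoothing, with one extra slot for never-seen values.
            numerator = seen[value] + 1
            denominator = sum(seen.values()) + len(vocab[name]) + 1
            score += math.log(numerator / denominator)  # likelihood term
        scores[epsg] = score
    return scores

def predict(attrs, model):
    """Return the EPSG code with the highest posterior score."""
    scores = log_posteriors(attrs, model)
    return max(scores, key=scores.get)

# Toy training data, for illustration only.
samples = [
    ({"DATUM": "WGS_1984", "PROJECTION": "Transverse_Mercator"}, 32643),
    ({"DATUM": "WGS_1984", "PROJECTION": "Transverse_Mercator"}, 32643),
    ({"DATUM": "WGS_1984", "PROJECTION": "Mercator_1SP"}, 3395),
    ({"DATUM": "North_American_Datum_1983", "PROJECTION": "Transverse_Mercator"}, 26918),
]
model = train(samples)
guess = predict({"DATUM": "WGS_1984", "PROJECTION": "Transverse_Mercator"}, model)
```

Working in log space avoids underflow when many attributes are multiplied together, and the smoothing keeps an attribute value that was never seen for a class from forcing its score to zero.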

    I am thinking of actually implementing all of these algorithms (I may
    add more in the future, depending on the suggestions) and selecting the
    one that gives the best performance.
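One way such a comparison could be set up is sketched below, using leave-one-out accuracy as the performance measure. Everything in it is an assumption made for illustration: the toy data, the Hamming distance used in place of the still-open weight function, and the majority-class baseline. Note that `k` in this sketch is the number of neighbours consulted, which is the usual meaning in kNN.

```python
from collections import Counter

def hamming_distance(a, b):
    """Number of attribute names on which two attribute dicts disagree."""
    keys = set(a) | set(b)
    return sum(1 for key in keys if a.get(key) != b.get(key))

def knn_predict(train, attrs, k=3):
    """Majority vote among the k training samples closest to attrs."""
    nearest = sorted(train, key=lambda s: hamming_distance(s[0], attrs))[:k]
    votes = Counter(epsg for _, epsg in nearest)
    return votes.most_common(1)[0][0]

def majority_predict(train, attrs):
    """Baseline: always predict the most frequent EPSG code in train."""
    return Counter(epsg for _, epsg in train).most_common(1)[0][0]

def leave_one_out_accuracy(samples, predict):
    """Hold out each sample in turn, predict it from the rest, return the hit rate."""
    hits = 0
    for i, (attrs, epsg) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if predict(rest, attrs) == epsg:
            hits += 1
    return hits / len(samples)

# Toy data, for illustration only: four files of one class, two of another.
utm43 = {"DATUM": "WGS_1984", "PROJECTION": "Transverse_Mercator"}
utm18 = {"DATUM": "North_American_Datum_1983", "PROJECTION": "Transverse_Mercator"}
samples = [(utm43, 32643)] * 4 + [(utm18, 26918)] * 2

results = {
    "knn (k=1)": leave_one_out_accuracy(
        samples, lambda train, attrs: knn_predict(train, attrs, k=1)
    ),
    "majority baseline": leave_one_out_accuracy(samples, majority_predict),
}
best = max(results, key=results.get)
```

Any candidate model that can be wrapped as `predict(train, attrs)` can be added to `results` and compared in the same way.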

    Please give me feedback on my proposal, and suggest anything I could
    add or change.
    Since very little time is left before the deadline, I would like to
    turn this into a proposal ASAP with your help.

    Regards,
    Sarthak

_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev
