Hello Ralph, this worked.
I added the eltypes option to force the readtable command to read
the first column in as a String rather than a destructively truncated Int32.
*df1_readtable_old = readtable("$df1_path")*
*df1_readtable_new = readtable("$df1_path", eltypes=[String,String,String])*
julia> *eltypes(df1_readtable_old)*
3-element Array{Type,1}:
Int32
String
String
julia> *eltypes(df1_readtable_new)*
3-element Array{Type,1}:
String
String
String
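For anyone landing on this thread later, the whole fix can be sketched end to end. This is a minimal sketch against the v0.5-era DataFrames API discussed here; the file paths and the non-account columns are hypothetical stand-ins for the real 10,000-line files:

```julia
using DataFrames  # v0.5-era API; readtable was later deprecated in favor of CSV.jl

# Hypothetical paths and contents standing in for the two real CSV files.
df1_path, df2_path = tempname() * ".csv", tempname() * ".csv"
open(df1_path, "w") do io
    write(io, "account_number,name,city\n8018884596,Ann,Reno\n")
end
open(df2_path, "w") do io
    write(io, "account_number,balance,branch\n8018884596,12.50,Main\n")
end

# Force every column to String so the ten-digit account numbers are
# never parsed into an overflowing Int32 column.
df1 = readtable(df1_path, eltypes=[String, String, String])
df2 = readtable(df2_path, eltypes=[String, String, String])

# Both :account_number columns are now String, so the left join works.
joined = join(df1, df2, on=:account_number, kind=:left)
```
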
Thanks everyone for the support.
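Ralph's point about the parser (quoted below) is easy to verify in base Julia: 8018884596 is larger than typemax(Int32), so wrapping it to 32 bits reproduces exactly the negative values from the original post:

```julia
# A ten-digit account number cannot fit in Int32.
@assert typemax(Int32) == 2147483647

# `x % Int32` wraps x to 32 bits (two's-complement), which is effectively
# what parsing these columns as Int32 did on the poster's system.
@assert 8018884596 % Int32 == -571049996
@assert 8018893530 % Int32 == -571041062
@assert 8018909633 % Int32 == -571024959
```
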
On Thursday, November 3, 2016 at 11:29:53 PM UTC-4, Ralph Smith wrote:
>
> Unless I misunderstand,
>
> df1 = readtable(file1,eltypes=[String,String,String])
>
>
> seems to be what you want.
>
> If you're new to Julia, the fact that a "vector of types" really means
> exactly that may be surprising.
>
> Let us hope that the new versions of DataFrames include a parser that
> doesn't treat most 10-digit numbers as Int32 on systems like yours.
>
> On Wednesday, November 2, 2016 at 4:15:20 PM UTC-4, LeAnthony Mathews
> wrote:
>>
>> Spoke too soon.
>> Again, I simply want the CSV column to be read in as a String rather
>> than an Int32.
>>
>> Still having issues casting the CSV file back into a Dataframe.
>> It's hard to understand why Julia insists on inferring the column types
>> itself when I use readtable, leaving me no control over the result.
>>
>> Why can I not say:
>> df1 = readtable(file1; types=Dict(1=>String)) # assuming your account
>> number is column # 1
>>
>> *Reading the Julia docs, Advanced Options for Reading CSV Files:*
>> *readtable accepts the following optional keyword arguments:*
>>
>> *eltypes::Vector{DataType} – Specify the types of all columns. Defaults
>> to [].*
>>
>>
>> *df1 = readtable(file1, Int32::Vector(String))*
>>
>> I get
>> *ERROR: TypeError: typeassert: expected Array{String,1}, got Type{Int32}*
>>
>> Is this even an option? Or how about converting df1_CSV to
>> df1_dataframe?
>> *df1_dataframe = convert(DataFrame, df1_CSV)*
>> since CSV.read seems to give more granular control.
>>
>>
>> On Tuesday, November 1, 2016 at 7:28:36 PM UTC-4, LeAnthony Mathews wrote:
>>>
>>> Great, that worked for forcing the column into a string type.
>>> Thanks
>>>
>>> On Monday, October 31, 2016 at 3:26:14 PM UTC-4, Jacob Quinn wrote:
>>>>
>>>> You could use CSV.jl: http://juliadata.github.io/CSV.jl/stable/
>>>>
>>>> In this case, you'd do:
>>>>
>>>> df1 = CSV.read(file1; types=Dict(1=>String)) # assuming your account
>>>> number is column # 1
>>>> df2 = CSV.read(file2; types=Dict(1=>String))
>>>>
>>>> -Jacob
>>>>
>>>>
>>>> On Mon, Oct 31, 2016 at 12:50 PM, LeAnthony Mathews <[email protected]
>>>> > wrote:
>>>>
>>>>> Using v0.5.0
>>>>> I have two different 10,000 line CSV files that I am reading into two
>>>>> different dataframe variables using the readtable function.
>>>>> Each table has in common a ten digit account_number that I would like
>>>>> to use as an index and join into one master file.
>>>>>
>>>>> Here is the account number example in the original CSV from file1:
>>>>> 8018884596
>>>>> 8018893530
>>>>> 8018909633
>>>>>
>>>>> When I do a readtable of this CSV into file1 and then do
>>>>> *typeof(file1[:account_number])* I get:
>>>>> *DataArrays.DataArray{Int32,1}*
>>>>> -571049996
>>>>> -571041062
>>>>> -571024959
>>>>>
>>>>> When I do
>>>>> *typeof(file2[:account_number])*
>>>>> *DataArrays.DataArray{String,1}*
>>>>>
>>>>>
>>>>> *Question: *
>>>>> My CSV files give no guidance that account_number should be Int32 or
>>>>> string type. How do I force it to make both account_number elements type
>>>>> String?
>>>>>
>>>>> I would like this join command to work:
>>>>> *new_account_join = join(file1, file2, on =:account_number,kind =
>>>>> :left)*
>>>>>
>>>>> But I am getting this error:
>>>>> *ERROR: TypeError: typeassert: expected Union{Array{Symbol,1},Symbol},
>>>>> got Array{Array{Symbol,1},1}*
>>>>> * in (::Base.#kw##join)(::Array{Any,1}, ::Base.#join,
>>>>> ::DataFrames.DataFrame, ::DataFrames.DataFrame) at .\<missing>:0*
>>>>>
>>>>>
>>>>> Any help would be appreciated.
>>>>>
>>>>>
>>>>>
>>>>