Hello Jacob, see below:
julia> Pkg.installed()
Dict{String,VersionNumber} with 25 entries:
"DataFrames" => v"0.8.4"
"DataStreams" => v"0.1.2"
"Calculus" => v"0.1.15"
"Reexport" => v"0.0.3"
"BinDeps" => v"0.4.5"
"Rmath" => v"0.1.4"
"Dates" => v"0.4.4"
"NullableArrays" => v"0.0.10"
"URIParser" => v"0.1.6"
"GZip" => v"0.2.20"
"CSV" => v"0.1.1"
"RDatasets" => v"0.2.0"
"SortingAlgorithms" => v"0.1.0"
"Compat" => v"0.9.3"
"FileIO" => v"0.2.0"
"Distributions" => v"0.11.0"
"DataArrays" => v"0.3.9"
"PDMats" => v"0.5.0"
"SHA" => v"0.2.1"
"StatsBase" => v"0.11.1"
"XGBoost" => v"0.2.0"
"RData" => v"0.0.4"
"WeakRefStrings" => v"0.2.0"
"StatsFuns" => v"0.3.1"
"CategoricalArrays" => v"0.1.0"
On Thursday, November 3, 2016 at 5:19:04 PM UTC-4, Jacob Quinn wrote:
>
> LeAnthony,
>
> I'm wondering if you're on an old version of DataFrames? There haven't
> been any issues "show"-ing DataFrames with NullableArray columns for quite
> some time. You can check (and post back here) your current package versions
> by doing:
>
> Pkg.installed()
>
> You can also ensure you're on the latest valid release by doing:
>
> Pkg.update()
>
>
> -Jacob
>
> On Thu, Nov 3, 2016 at 3:15 PM, Milan Bouchet-Valat <[email protected]> wrote:
>
>> On Thursday, November 3, 2016 at 13:35 -0700, LeAnthony Mathews wrote:
>> > Thanks Michael,
>> > I have been thinking about this all day. Yes, basically I am going to
>> > have to create a macro CSVreadtable that mimics the readtable
>> > command, but in the expansion uses CSV.read. The macro will manually
>> > construct a DataFrame of the same size as the one readtable returns,
>> > but use the column types I specify or inherit from the original
>> > readtable command. The macro can use the current CSV.read parameters.
>> >
>> > So this would work.
>> > df1_CSVreadtable = CSVreadtable("$df1_path"; types=Dict(1=>String))
>> >
>> > so that:
>> > eltypes(df1_CSVreadtable)
>> > 3-element Array{Type,1}:
>> > Int32
>> > String
>> > String
>> >
>> >
>> > Anyway, I was looking for a quick fix, but at least I will learn
>> > some Julia.
>> If you don't have missing values and just want a Vector{String}, you
>> can pass nullable=false to CSV.read().
>>
>>
>> Regards
>>
>> >
>> >
>> > > DataFrames is currently undergoing a very major change. Looks like
>> > > CSV creates the new type of DataFrames. I hope someone can help you
>> > > with using that. As a workaround, on the normal DataFrames version,
>> > > I have generally just replaced with a string representation:
>> > > ```
>> > > df[:account_numbers] = ["$account_number" for account_number in df[:account_numbers]]
>> > > ```
>> > >
>> > > On Thu, Nov 3, 2016 at 3:05 PM, LeAnthony Mathews <[email protected]> wrote:
>> > > > Sure, so I need col #1 in my CSV to be a string in my data frame.
>> > > >
>> > > >
>> > > > So as a test I tried to load the file 3 different ways:
>> > > >
>> > > > df1_CSV = CSV.read("$df1_path"; types=Dict(1=>String)) # forcing the column to stay a string
>> > > > df1_readtable = readtable("$df1_path") # Do not know how to force the column to stay a string
>> > > > df1_convertDF = convert(DataFrame, df1_CSV)
>> > > >
>> > > > Here is the output. If they are all DataFrames, then showcols
>> > > > should work on all three:
>> > > >
>> > > > julia> names(df1_CSV)
>> > > > 3-element Array{Symbol,1}:
>> > > > :account_number
>> > > > Symbol("Discharge Date")
>> > > > :site
>> > > >
>> > > > julia> names(df1_readtable)
>> > > > 3-element Array{Symbol,1}:
>> > > > :account_number
>> > > > :Discharge_Date
>> > > > :site
>> > > >
>> > > > julia> names(df1_convertDF)
>> > > > 3-element Array{Symbol,1}:
>> > > > :account_number
>> > > > Symbol("Discharge Date")
>> > > > :site
>> > > >
>> > > >
>> > > > julia> eltypes(df1_CSV)
>> > > > 3-element Array{Type,1}:
>> > > > Nullable{String}
>> > > > Nullable{WeakRefString{UInt8}}
>> > > > Nullable{WeakRefString{UInt8}}
>> > > >
>> > > > julia> eltypes(df1_readtable)
>> > > > 3-element Array{Type,1}:
>> > > > Int32 #Do not know how to force the column to stay a string
>> > > > String
>> > > > String
>> > > >
>> > > > julia> eltypes(df1_convertDF)
>> > > > 3-element Array{Type,1}:
>> > > > Nullable{String}
>> > > > Nullable{WeakRefString{UInt8}}
>> > > > Nullable{WeakRefString{UInt8}}
>> > > >
>> > > > julia> showcols(df1_convertDF)
>> > > > 1565x3 DataFrames.DataFrame
>> > > > ERROR: MethodError: no method matching countna(::NullableArrays.NullableArray{String,1})
>> > > > Closest candidates are:
>> > > > countna(::Array{T,N}) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\other\utils.jl:115
>> > > > countna(::DataArrays.DataArray{T,N}) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\other\utils.jl:128
>> > > > countna(::DataArrays.PooledDataArray{T,R<:Integer,N}) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\other\utils.jl:143
>> > > > in colmissing(::DataFrames.DataFrame) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\abstractdataframe\abstractdataframe.jl:657
>> > > > in showcols(::Base.TTY, ::DataFrames.DataFrame) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\abstractdataframe\show.jl:574
>> > > > in showcols(::DataFrames.DataFrame) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\abstractdataframe\show.jl:581
>> > > >
>> > > > julia> showcols(df1_readtable)
>> > > > 1565x3 DataFrames.DataFrame
>> > > > │ Col # │ Name │ Eltype │ Missing │
>> > > > ├───────┼────────────────┼────────┼─────────┤
>> > > > │ 1 │ account_number │ Int32 │ 0 │
>> > > > │ 2 │ Discharge_Date │ String │ 0 │
>> > > > │ 3 │ site │ String │ 0 │
>> > > >
>> > > > julia> showcols(df1_CSV)
>> > > > 1565x3 DataFrames.DataFrame
>> > > > ERROR: MethodError: no method matching countna(::NullableArrays.NullableArray{String,1})
>> > > > Closest candidates are:
>> > > > countna(::Array{T,N}) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\other\utils.jl:115
>> > > > countna(::DataArrays.DataArray{T,N}) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\other\utils.jl:128
>> > > > countna(::DataArrays.PooledDataArray{T,R<:Integer,N}) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\other\utils.jl:143
>> > > > in colmissing(::DataFrames.DataFrame) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\abstractdataframe\abstractdataframe.jl:657
>> > > > in showcols(::Base.TTY, ::DataFrames.DataFrame) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\abstractdataframe\show.jl:574
>> > > > in showcols(::DataFrames.DataFrame) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\abstractdataframe\show.jl:581
>> > > >
>> > > >
>> > > >
>> > > > > The result of CSV should be a DataFrame by default. What
>> > > > > return type do you get?
>> > > > >
>> > > >
>> > >
>> > >
>>
>
>