I have corresponded with Javad off-line; I am posting this as a follow-up to
close the issue. There are two separate questions here. The first is why the
posted code below failed. The second is whether there is an easy way to read in
the values, given the oddities of his file (more on that below), and yes, it
turns out "terra" performs much better than "raster".
First, a little about GRIB files in general, and this file in particular. GRIB
files were designed to store grids in files with very small footprints, usually
from model output. GRIB files have the characteristic that if I "cat" a bunch
of GRIB files together, I still have a valid GRIB file. So a given GRIB file
often contains multiple parameters, each of them through time. To translate
into the terms used by R spatial packages, each variable's grid at each time
period will be seen as a "band" or a "layer" in the dataset. Thus if I have 3
parameters, say temperature, dew point, and cloud cover, at 8670 time periods,
R spatial packages will see 3*8670 layers or bands.
However, this GRIB file is clearly not a raw GRIB file made by a data provider,
but rather an extract made with some tool. In this file there are 3 underlying
time series: temperature, dew point, and cloud cover, each of dimension (1, 1,
8670); that is, each is defined at one spatial point only.
So why did the code below fail? raster::stack() read the file just fine, and
the layer_names are correct, all 3*8670 of them. The next steps:
>> # Extract layers based on layer names - adjust as necessary
>> t2m <- raster_data[[grep("t2m", layer_names)]]
>> d2m <- raster_data[[grep("d2m", layer_names)]]
>> tcc <- raster_data[[grep("tcc", layer_names)]]
>> valid_time <- raster_data[[grep("valid_time", layer_names)]]
>> t2m
which are aimed at subsetting the raster stack based on the variable name, all
fail because "t2m", "d2m", and "tcc" do not appear anywhere in layer_names, so
each grep() returns integer(0) and each subset is an empty stack. Then, of
course, everything else fails.
So the next question is how to read the file and get a data frame. A first
pass would be:
raster_stack <- raster::stack("Met.grib")
raster_data <- raster::getValues(raster_stack)
but you do not want to do that. For whatever reason, the raster_stack created
above is enormous, and raster::getValues() takes forever; I aborted after
about an hour. However, if you use "terra" instead:
raster_stack <- terra::rast("Met.grib")
raster_data <- terra::values(raster_stack)
it finishes in a flash. raster_data is now one long array that needs to be
reshaped:
raster_data <- array(raster_data, dim = c(3, 8670))
Now raster_data[1, ] contains the temperature series, raster_data[2, ]
contains the dew point series, and raster_data[3, ] contains the cloud cover
series.
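Here is a self-contained sketch of the reshape and of building the data frame
the original post was after, using fake values in place of the real file. The
key assumption (worth verifying against a few known values from the file) is
that the layers come back interleaved as t2m, d2m, tcc, t2m, d2m, tcc, ...;
since array() fills column-major, each variable then lands in its own row. The
Kelvin conversion and relative-humidity formula are the ones from the quoted
code:

```r
# Fake values standing in for terra::values() output: 3 variables at 2 time
# steps, interleaved as t2m, d2m, tcc, t2m, d2m, tcc (an assumption to verify).
vals <- c(290, 285, 0.5,   # time 1: t2m, d2m, tcc
          291, 286, 0.6)   # time 2: t2m, d2m, tcc
raster_data <- array(vals, dim = c(3, 2))  # column-major: one variable per row

df <- data.frame(
  t2m = raster_data[1, ] - 273.15,  # Kelvin -> Celsius
  d2m = raster_data[2, ] - 273.15,
  tcc = raster_data[3, ]
)

# Magnus-type relative humidity, exactly as in the quoted code
es    <- 6.112 * exp((17.67 * df$t2m) / (df$t2m + 243.5))
e     <- 6.112 * exp((17.67 * df$d2m) / (df$d2m + 243.5))
df$RH <- 100 * e / es
df
```

With the real file you would replace vals with terra::values(raster_stack) and
dim = c(3, 2) with dim = c(3, 8670).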
HTH,
-Roy
PS - GRIB files can be a bear. The best GRIB readers in R are just front ends
for either ecCodes or wgrib2, and while they work very well, installation of
ecCodes and wgrib2 can be non-trivial, so I didn't want to use them as a
solution. Also, for quick viewing of GRIB files there is NASA's Panoply
(https://www.giss.nasa.gov/tools/panoply/), but that requires a Java
installation.
> On Sep 23, 2024, at 11:31 PM, javad bayat <[email protected]> wrote:
>
> Dear R users;
> I have downloaded a grib file format (Met.grib) and I want to export its
> data to excel file. Also I want to do some mathematic on some columns. But
> I got error. I would be more than happy if anyone can help me to do this. I
> have provided the codes and the Met.grib file in this email.
> Sincerely yours
>
> # Load the necessary libraries
>> library(raster) # For reading GRIB files
>> library(dplyr) # For data manipulation
>> library(lubridate) # For date manipulation
>> library(openxlsx) # For writing Excel files
>
> # Specify the file paths
>> grib_file_path <- "C:/Users/Omrab_Lab/Downloads/Met.grib"
>> excel_file_path <- "C:/Users/Omrab_Lab/Downloads/Met_updated.xlsx"
>
> # Open the GRIB file
>> raster_data <- stack(grib_file_path)
>
> # Check the names of the layers to identify which ones to extract
>> layer_names <- names(raster_data)
>> print(layer_names) # Prints
>
>
>> # Extract layers based on layer names - adjust as necessary
>> t2m <- raster_data[[grep("t2m", layer_names)]]
>> d2m <- raster_data[[grep("d2m", layer_names)]]
>> tcc <- raster_data[[grep("tcc", layer_names)]]
>> valid_time <- raster_data[[grep("valid_time", layer_names)]]
>> t2m
> class : RasterStack
> nlayers : 0
>
>> # Check if the raster layers are loaded correctly
>> if (is.null(t2m) || is.null(d2m) || is.null(tcc) || is.null(valid_time))
> {
> + stop("One or more raster layers could not be loaded. Please check the
> layer names.")
> + }
>
>> # Convert raster values to vectors
>> t2m_values <- values(t2m)
> Error in dimnames(x) <- dn :
> length of 'dimnames' [2] not equal to array extent
>> d2m_values <- values(d2m)
> Error in dimnames(x) <- dn :
> length of 'dimnames' [2] not equal to array extent
>> tcc_values <- values(tcc)
> Error in dimnames(x) <- dn :
> length of 'dimnames' [2] not equal to array extent
>> valid_time_values <- values(valid_time)
> Error in dimnames(x) <- dn :
> length of 'dimnames' [2] not equal to array extent
>
> # Check for NA values and dimensions
> if (any(is.na(t2m_values)) || any(is.na(d2m_values)) || any(is.na(tcc_values))
> || any(is.na(valid_time_values))) {
> warning("One or more layers contain NA values. These will be removed.")
> }
>
> # Create the data frame, ensuring no NA values are included
> df <- data.frame(
> t2m = t2m_values,
> d2m = d2m_values,
> tcc = tcc_values,
> valid_time = valid_time_values,
> stringsAsFactors = FALSE
> )
>
> # Remove rows with NA values
> df <- na.omit(df)
>
> # Convert temperatures from Kelvin to Celsius
> df$t2m <- df$t2m - 273.15
> df$d2m <- df$d2m - 273.15
>
> # Calculate relative humidity
> calculate_relative_humidity <- function(t2m, d2m) {
> es <- 6.112 * exp((17.67 * t2m) / (t2m + 243.5))
> e <- 6.112 * exp((17.67 * d2m) / (d2m + 243.5))
> rh <- (e / es) * 100
> return(rh)
> }
> df$RH <- calculate_relative_humidity(df$t2m, df$d2m)
>
> # Convert valid_time from numeric to POSIXct assuming it's in seconds since
> the epoch
> df$valid_time <- as.POSIXct(df$valid_time, origin = "1970-01-01")
>
> # Extract year, month, day, and hour from valid_time
> df$Year <- year(df$valid_time)
> df$Month <- month(df$valid_time)
> df$Day <- day(df$valid_time)
> df$Hour <- hour(df$valid_time)
>
> # Select only the desired columns
> df_selected <- df %>% select(Year, Month, Day, Hour, tcc, t2m, RH)
>
> # Save the updated DataFrame to an Excel file
> write.xlsx(df_selected, excel_file_path, row.names = FALSE)
>
>
>
>
>
>
> --
> Best Regards
> Javad Bayat
> M.Sc. Environment Engineering
> Alternative Mail: [email protected]
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [email protected] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
**********************
"The contents of this message do not reflect any position of the U.S.
Government or NOAA."
**********************
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
***Note new street address***
110 McAllister Way
Santa Cruz, CA 95060
Phone: (831)-420-3666
Fax: (831) 420-3980
e-mail: [email protected] www: https://www.pfeg.noaa.gov/
"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected"
"the arc of the moral universe is long, but it bends toward justice" -MLK Jr.