I would like to run parametrized queries in BigQuery passing both the body and the parameters to an appropriate DBI function within R. So fo example, if I have a query like
query <- "SELECT * FROM table WHERE param = @my_param"
I would like to be able to do
result <- dbGetQuery(conn, query, params = list(my_param = "value"))
However, this seems to fail and I get an error from BigQuery saying that the parameteer my_param wasn't found - as if the argument "params" passed to dbGetQuery had been ignored.
I have searched through the DBI and BigQuery documentation but they are either elusive (the first) and not provide support for the R's interface (the second).
I'd rather use Python if I could but I am bound to use R. Does anyone know what is the correct specification to pass named parameters to BigQuery?
On a side note, I have also tried the positional argument approach described here for other platform but my statement got rejected with an error saying that "positional parameters are not supported".
I would like to run parametrized queries in BigQuery passing both the body and the parameters to an appropriate DBI function within R. So fo example, if I have a query like
query <- "SELECT * FROM table WHERE param = @my_param"
I would like to be able to do
result <- dbGetQuery(conn, query, params = list(my_param = "value"))
However, this seems to fail and I get an error from BigQuery saying that the parameteer my_param wasn't found - as if the argument "params" passed to dbGetQuery had been ignored.
I have searched through the DBI and BigQuery documentation but they are either elusive (the first) and not provide support for the R's interface (the second).
I'd rather use Python if I could but I am bound to use R. Does anyone know what is the correct specification to pass named parameters to BigQuery?
On a side note, I have also tried the positional argument approach described here for other platform but my statement got rejected with an error saying that "positional parameters are not supported".
I thing the main issue in your code is the way how you are constructing the query. Some time ago I faced the same problem, and I put everything inside a function. In this case the function access to a DBI dataset with several columns. In this case I'm sampling three columns (date, column_01, column_02). Here is a simplify version of the code that works for me.
# Create the function with parameters
runQuery <- function(DB_path, column_name, filters = list(), filter_logic = list()) {
library(DBI)
library(duckdb)
library(glue)
con <- dbConnect(duckdb::duckdb(), ":memory:")
on.exit(dbDisconnect(con, shutdown = TRUE), add = TRUE)
where_clauses <- lapply(names(filters), function(filter_name) {
filter_value <- filters[[filter_name]]
if (is.null(filter_value)) return(NULL)
if (filter_name == "date" && is.list(filter_value)) {
year_clauses <- c()
params <- list()
if (!is.null(filter_value$min)) {
year_clauses <- c(year_clauses, "CAST(year(date) AS INTEGER) >= ?")
params <- c(params, filter_value$min)
}
if (!is.null(filter_value$max)) {
year_clauses <- c(year_clauses, "CAST(year(date) AS INTEGER) <= ?")
params <- c(params, filter_value$max)
}
if (length(year_clauses) > 0) {
return(list(clause = paste(year_clauses, collapse = " AND "), params = params))
} else {
return(NULL)
}
} else if (is.vector(filter_value)) {
return(list(clause = "regexp_matches(column_01, ?, 'i')", params = paste(filter_value, collapse = "|")))
} else if (is.character(filter_value)) {
return(list(clause = "regexp_matches(column_02, ?, 'i')", params = filter_value))
} else if (is.numeric(filter_value)) {
return(list(clause = glue("{filter_name} = ?"), params = filter_value))
}
return(NULL)
})
where_clauses <- where_clauses[!sapply(where_clauses, is.null)]
if (length(where_clauses) == 0) {
where_clause <- ""
params <- list()
} else {
combined_clauses <- sapply(where_clauses, function(x) x$clause)
params <- unlist(lapply(where_clauses, function(x) x$params))
where_clause <- paste("WHERE", paste(combined_clauses, collapse = " AND "))
}
query <- glue("SELECT {column_name} FROM read_parquet('{DB_path}') {where_clause};")
tryCatch({
result <- dbGetQuery(con, query, params)
return(result)
}, error = function(e) {
stop("Error executing query: ", e$message)
})
}
# Define your parameters
filters_list <- list(
date = list(min = 2020, max = 2022),
column_01 = c("keyword 1", "keyword 2", "keyword 3"),
column_02 = c("keyword 4", "keyword 5")
)
# Run the query!
result <- runQuery(
DB_path = "path to the DBI database",
column_name = "*",
filters = filters_list
) |> tibble::as_tibble()
I figured it out and I'm posting an answer in case anyone might find it useful in the future.
It seems like the methods dbSemdQiery
and dbFetch
work better when interfacing with BigQuery. As a matter of fact, dbGetQuery
calls them but for some reasons it doesn't handle parameters.
The following code works as expected for me:
library(DBI)
library(bigrquery)
conn <- dbConnect(...)
query <- "SELECT * FROM `table_id` WHERE col_1 = @par_1"
# send the query to BigQuery
res <- dbSendQuery(
conn,
statement = query,
params = list("par_1" = 2)
)
# fetch the result set, binding the parameters
dbFetch(res)
DBI maps R data types to BQ data types but only to a certain extent so caurion must be exercised.
?
(not named) for parameter passing. Have you tried"select * from table where param=?
andparams=list("value")
? – r2evans Commented Mar 5 at 16:52?
,?num
, or:name
. Postgres supports (at least at some point)$1
(1-based, sequential). SQL Server always uses?
. And all of those DBMSes through the odbc package and an ODBC connection only supports the single?
. I find it frustrating that there had been so much effort to get away from positional into named places to justify glue, but neither DBI nor dbx devs think it is useful to standardize/protect queries in a similar fashion. Sry, I've got nothing for you here. – r2evans Commented Mar 5 at 19:47