r - Parametrized queries in DBI when connecting to BigQuery - Stack Overflow

admin2025-04-19  0

I would like to run parametrized queries in BigQuery passing both the body and the parameters to an appropriate DBI function within R. So fo example, if I have a query like

query <- "SELECT * FROM table WHERE param = @my_param"

I would like to be able to do

result <- dbGetQuery(conn, query, params = list(my_param = "value"))

However, this seems to fail and I get an error from BigQuery saying that the parameteer my_param wasn't found - as if the argument "params" passed to dbGetQuery had been ignored.

I have searched through the DBI and BigQuery documentation but they are either elusive (the first) and not provide support for the R's interface (the second).

I'd rather use Python if I could but I am bound to use R. Does anyone know what is the correct specification to pass named parameters to BigQuery?

On a side note, I have also tried the positional argument approach described here for other platform but my statement got rejected with an error saying that "positional parameters are not supported".

I would like to run parametrized queries in BigQuery passing both the body and the parameters to an appropriate DBI function within R. So fo example, if I have a query like

query <- "SELECT * FROM table WHERE param = @my_param"

I would like to be able to do

result <- dbGetQuery(conn, query, params = list(my_param = "value"))

However, this seems to fail and I get an error from BigQuery saying that the parameteer my_param wasn't found - as if the argument "params" passed to dbGetQuery had been ignored.

I have searched through the DBI and BigQuery documentation but they are either elusive (the first) and not provide support for the R's interface (the second).

I'd rather use Python if I could but I am bound to use R. Does anyone know what is the correct specification to pass named parameters to BigQuery?

On a side note, I have also tried the positional argument approach described here for other platform but my statement got rejected with an error saying that "positional parameters are not supported".

Share Improve this question asked Mar 5 at 16:05 CaligulaCaligula 1053 bronze badges 3
  • I apologize that I cannot check BigQuery specifically, but most DBI connections use a singular ? (not named) for parameter passing. Have you tried "select * from table where param=? and params=list("value")? – r2evans Commented Mar 5 at 16:52
  • @r2evans yes, I did try that and BigQuery returned the error "positional parameters are not allowed". I read somewhere in the DBI documentation that named parameters are allowed depending on the DBMS/service you're connecting to but I cannot find the page anymore. Still it didn't mention BigQuery. – Caligula Commented Mar 5 at 17:46
  • Yes, it varies. Every. Single. Provider. Changes. Things. Sqlite, for instance, supports ?, ?num, or :name. Postgres supports (at least at some point) $1 (1-based, sequential). SQL Server always uses ?. And all of those DBMSes through the odbc package and an ODBC connection only supports the single ?. I find it frustrating that there had been so much effort to get away from positional into named places to justify glue, but neither DBI nor dbx devs think it is useful to standardize/protect queries in a similar fashion. Sry, I've got nothing for you here. – r2evans Commented Mar 5 at 19:47
Add a comment  | 

2 Answers 2

Reset to default 0

I thing the main issue in your code is the way how you are constructing the query. Some time ago I faced the same problem, and I put everything inside a function. In this case the function access to a DBI dataset with several columns. In this case I'm sampling three columns (date, column_01, column_02). Here is a simplify version of the code that works for me.

# Create the function with parameters

runQuery <- function(DB_path, column_name, filters = list(), filter_logic = list()) {
  library(DBI)
  library(duckdb)
  library(glue)
  
  con <- dbConnect(duckdb::duckdb(), ":memory:")
  on.exit(dbDisconnect(con, shutdown = TRUE), add = TRUE)
  
  where_clauses <- lapply(names(filters), function(filter_name) {
    filter_value <- filters[[filter_name]]
    if (is.null(filter_value)) return(NULL)
    
    if (filter_name == "date" && is.list(filter_value)) {
      year_clauses <- c()
      params <- list()
      if (!is.null(filter_value$min)) {
        year_clauses <- c(year_clauses, "CAST(year(date) AS INTEGER) >= ?")
        params <- c(params, filter_value$min)
      }
      if (!is.null(filter_value$max)) {
        year_clauses <- c(year_clauses, "CAST(year(date) AS INTEGER) <= ?")
        params <- c(params, filter_value$max)
      }
      if (length(year_clauses) > 0) {
        return(list(clause = paste(year_clauses, collapse = " AND "), params = params))
      } else {
        return(NULL)
      }
      
    } else if (is.vector(filter_value)) {
      return(list(clause = "regexp_matches(column_01, ?, 'i')", params = paste(filter_value, collapse = "|")))
      
    } else if (is.character(filter_value)) {
      return(list(clause = "regexp_matches(column_02, ?, 'i')", params = filter_value))
      
    } else if (is.numeric(filter_value)) {
      return(list(clause = glue("{filter_name} = ?"), params = filter_value))
    }
    return(NULL)
  })
  
  where_clauses <- where_clauses[!sapply(where_clauses, is.null)] 
  if (length(where_clauses) == 0) {
    where_clause <- ""
    params <- list()
  } else {
    combined_clauses <- sapply(where_clauses, function(x) x$clause)
    params <- unlist(lapply(where_clauses, function(x) x$params))
    
    where_clause <- paste("WHERE", paste(combined_clauses, collapse = " AND "))
  }
  
  query <- glue("SELECT {column_name} FROM read_parquet('{DB_path}') {where_clause};")
  
  tryCatch({
    result <- dbGetQuery(con, query, params)
    return(result)
  }, error = function(e) {
    stop("Error executing query: ", e$message)
  })
}


# Define your parameters
filters_list <- list(
  date = list(min = 2020, max = 2022),
  column_01 = c("keyword 1", "keyword 2", "keyword 3"),
  column_02 = c("keyword 4", "keyword 5")
)

# Run the query! 
result <- runQuery(
  DB_path = "path to the DBI database",
  column_name = "*",
  filters = filters_list
) |> tibble::as_tibble()

I figured it out and I'm posting an answer in case anyone might find it useful in the future.

It seems like the methods dbSemdQiery and dbFetch work better when interfacing with BigQuery. As a matter of fact, dbGetQuery calls them but for some reasons it doesn't handle parameters.

The following code works as expected for me:

library(DBI)
library(bigrquery)

conn <- dbConnect(...)
query <- "SELECT * FROM `table_id` WHERE col_1 = @par_1"

# send the query to BigQuery
res <- dbSendQuery(
    conn, 
    statement = query, 
    params = list("par_1" = 2)
)

# fetch the result set, binding the parameters
dbFetch(res)

DBI maps R data types to BQ data types but only to a certain extent so caurion must be exercised.

转载请注明原文地址:http://conceptsofalgorithm.com/Algorithm/1745021891a280429.html

最新回复(0)