vignettes/phenotype-services.Rmd
In this example, we’ll explore the capabilities of the phenotype services using the gorr package.
First, load the gorr package with library(gorr). The tidyverse package is recommended in general, but it is not required for this example.
Next, we need to establish a connection to our API services. To do that, we call platform_connect and provide it with the parameters pointing to the phenotype-catalog-service, i.e. api_key and project:
conn <- platform_connect(api_key = Sys.getenv("GOR_API_KEY"),
project = Sys.getenv("GOR_API_PROJECT"))
conn
#> ── GOR API service connection ──────────────────────────────────────────────────
#> • Service Root/s: https://platform.wuxinextcodedev.com/api/query, https://platform.wuxinextcodedev.com/api/phenotype-catalog, https://platform.wuxinextcodedev.com/queryserver, https://platform.wuxinextcodedev.com/workflow
#> • Project: ukbb_hg38
#> • API key issued at: 2022-05-18 10:18:54
#> • API key expires at: Never
#> • Access token issued at: 2022-06-16 15:56:13
#> • Access token expires at: 2022-06-17 15:56:13
If everything goes as planned, we’ll have a conn object to pass into subsequent functions.
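Because the connection call above reads its credentials from environment variables, it can help to confirm that they are set before connecting. The following is a minimal base R sketch; the helper name missing_env_vars is hypothetical and not part of gorr.

```r
# Hypothetical helper (not part of gorr): return the names of the
# environment variables that are unset or empty.
missing_env_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

# The two variables used in the platform_connect call above
missing <- missing_env_vars(c("GOR_API_KEY", "GOR_API_PROJECT"))
if (length(missing) > 0) {
  message("Set these before connecting: ", paste(missing, collapse = ", "))
}
```

If the message appears, the variables can be set for the current session with Sys.setenv or persistently in an .Renviron file.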
Let’s start by fetching the phenotypes available in the project (the first 25):
phenotypes <- get_phenotypes(conn, limit=25)
print(phenotypes[1:4])
#> $rtest_pheno75
#> ── Phenotype ───────────────────────────────────────────────────────────────────
#> $name: rtest_pheno75
#> $description: This is a test phenotype
#> $result_type: CATEGORY
#> $tag_list:
#> $pn_count:
#> $query:
#>
#> $test_pheno100
#> ── Phenotype ───────────────────────────────────────────────────────────────────
#> $name: test_pheno100
#> $description: This is a test phenotype
#> $result_type: QT
#> $tag_list:
#> $pn_count:
#> $query:
#>
#> $test_pheno67
#> ── Phenotype ───────────────────────────────────────────────────────────────────
#> $name: test_pheno67
#> $description: This is a test phenotype
#> $result_type: QT
#> $tag_list:
#> $pn_count:
#> $query:
#>
#> $test_pheno99
#> ── Phenotype ───────────────────────────────────────────────────────────────────
#> $name: test_pheno99
#> $description:
#> $result_type: CATEGORY
#> $tag_list:
#> $pn_count:
#> $query:
The results come back as a named list of phenotypes, keyed by phenotype name.
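Since the result is a named list, ordinary list tools apply to it. The sketch below uses a hand-built stand-in (phenotypes_mock, a hypothetical object that mirrors the printed output above) and assumes each phenotype exposes a result_type element, as the printed fields suggest. It is not a description of gorr internals.

```r
# Hypothetical stand-in for the list returned by get_phenotypes();
# the names and fields are copied from the printed output above.
phenotypes_mock <- list(
  rtest_pheno75 = list(name = "rtest_pheno75", result_type = "CATEGORY"),
  test_pheno100 = list(name = "test_pheno100", result_type = "QT"),
  test_pheno67  = list(name = "test_pheno67",  result_type = "QT"),
  test_pheno99  = list(name = "test_pheno99",  result_type = "CATEGORY")
)

# Extract one field from every phenotype as a named character vector
result_types <- vapply(phenotypes_mock, function(p) p$result_type, character(1))

# Names of the quantitative (QT) phenotypes
qt_names <- names(result_types)[result_types == "QT"]
qt_names
```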
Next, let’s fetch a single phenotype from the project. We’ll use the first one listed:
phenotype <- get_phenotype(name = names(phenotypes)[1], conn)
phenotype
#> ── Phenotype ───────────────────────────────────────────────────────────────────
#> $name: rtest_pheno75
#> $description: This is a test phenotype
#> $result_type: CATEGORY
#> $tag_list:
#> $pn_count:
#> $query:
The result comes back as a phenotype object, which is a list of lists containing information about the phenotype.
We can add a new phenotype to the project using the create_phenotype function as follows:
name <- paste0("rtest_pheno", sample(1:99,1)) # Name of new phenotype
result_type <- "CATEGORY" # Type of phenotype (either "QT", "SET" or "CATEGORY")
description <- "This is a test phenotype" # Optional phenotype description
new_phenotype <- create_phenotype(name, result_type, conn, description)
new_phenotype
#> ── Phenotype ───────────────────────────────────────────────────────────────────
#> $name: rtest_pheno50
#> $description: This is a test phenotype
#> $result_type: CATEGORY
#> $tag_list:
#> $pn_count:
#> $query:
A new phenotype has now been added to the project’s phenotype-catalog and assigned to the variable new_phenotype.
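The example above draws a random suffix between 1 and 99, so the generated name could coincide with one already in the catalog (rtest_pheno75 appears in the earlier listing, for instance). One way to avoid that is to pick only from names that are not yet in use. This is a sketch in base R; pick_unused_name is a hypothetical helper, and the existing vector is hard-coded here, whereas in practice it would presumably come from names(get_phenotypes(conn)).

```r
# Hypothetical helper (not part of gorr): choose a random name of the form
# <prefix><id> that does not appear in `existing`.
pick_unused_name <- function(existing, prefix = "rtest_pheno", ids = 1:99) {
  candidates <- setdiff(paste0(prefix, ids), existing)
  if (length(candidates) == 0) {
    stop("No unused name left for prefix ", prefix)
  }
  candidates[sample.int(length(candidates), 1)]
}

# Names taken from the listing earlier in this example
existing <- c("rtest_pheno75", "test_pheno100", "test_pheno67", "test_pheno99")
candidate_name <- pick_unused_name(existing)
candidate_name
```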
Phenotypes can also be created in a single call with the full set of metadata and a GOR/NOR query. The following example is not run:
##NOT RUN##
gor_query <- "
def #phenos# = added_salt_to_food;
create ##aggregate## = nor UKBB/phenotypes/fields.nord -s phenotype -f #phenos#
| ATMAX visit_id -gc PN,phenotype
| map -h -m '#empty#' -c phenotype,value <(nor -h UKBB/phenotypes/meta/field_encoding_lookup.tsv)
| REPLACE value if(meaning = '#empty#',value,meaning)
| hide meaning,visit_id,index_id;
nor [##aggregate##] | sort -c PN
| pivot phenotype -ordered -gc PN -v #phenos# -e NA
| rename (.*)_value #{1}
| select PN
"
name2 <- paste0("rtest_pheno", sample(1:99,1))
description2 <- "A set of individuals who answered YES to the question: do you salt your food? https://biobank.ndph.ox.ac.uk/ukb/field.cgi?id=104660"
result_type2 <- "SET"
category2 <- "symptoms"
query2 <- gor_query
tags2 <- "dietary,nutrition"
new_phenotype2 <- create_phenotype(name2, result_type2, conn, description2, query = query2, tags = tags2)
new_phenotype2
phenotype_delete(new_phenotype2, conn)
If the phenotype’s state is ‘pending’, or we simply want the most recent information on the phenotype, we can refresh it using the phenotype_refresh function. As the warning in the output shows, the conn argument is deprecated for this call:
new_phenotype <- phenotype_refresh(new_phenotype, conn)
#> Warning in deprecated_argument_msg(conn) %>% warning(): conn argument deprecated
new_phenotype
#> ── Phenotype ───────────────────────────────────────────────────────────────────
#> $name: rtest_pheno50
#> $description: This is a test phenotype
#> $result_type: CATEGORY
#> $tag_list:
#> $pn_count:
#> $query:
If the query fails, we can call either phenotype_get_error(phenotype, conn), which returns the latest error (formatted), or phenotype_get_errors(phenotype, conn), which returns all errors as lists along with their timestamps.
Passing a phenotype object and a new description to phenotype_update_description updates the phenotype’s description in the project’s phenotype-catalog. The output is, again, an updated phenotype object.
new_description <- "An updated description"
new_phenotype <- phenotype_update_description(new_description, new_phenotype, conn)
new_phenotype
#> ── Phenotype ───────────────────────────────────────────────────────────────────
#> $name: rtest_pheno50
#> $description: An updated description
#> $result_type: CATEGORY
#> $tag_list:
#> $pn_count:
#> $query:
To upload data to a phenotype, let’s start by creating a dummy dataset, which is a list of lists: [[PN#1, attribute], [PN#2, attribute], … , [PN#N, attribute]]
cohort_size <- 2000
# `:` binds tighter than `+`, so this yields the 1,001 PNs "2000" to "3000"
pns <- as.character((1000:cohort_size) + 1000)
data <- lapply(pns, function(x) { list(x, if (runif(1) > 0.5) 'obease' else 'lean') })
print(data[1:2])
#> [[1]]
#> [[1]][[1]]
#> [1] "2000"
#>
#> [[1]][[2]]
#> [1] "lean"
#>
#>
#> [[2]]
#> [[2]][[1]]
#> [1] "2001"
#>
#> [[2]][[2]]
#> [1] "lean"
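Real phenotype values usually start out in a table rather than in a list of lists. The sketch below shows one way to reshape a data frame into the [[PN, attribute], …] structure shown above, using base R only. rows_to_pairs and the column names pn and value are hypothetical choices for this illustration.

```r
# A small example table; the column names are assumptions for this sketch
df <- data.frame(
  pn    = c("2000", "2001", "2002"),
  value = c("lean", "obease", "lean"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn each row into an unnamed list(pn, value) pair
rows_to_pairs <- function(df) {
  unname(Map(function(pn, value) list(pn, value), df$pn, df$value))
}

pairs <- rows_to_pairs(df)
str(pairs[1:2])
```

The unname call drops the names that Map would otherwise derive from the pn column, so the result has the same unnamed shape as the dummy dataset.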
phenotype_upload_data(new_phenotype, data, conn)
#> Successfully uploaded phenotype data
Phenotype data can then be fetched using get_data:
get_data(new_phenotype, conn)
#> Warning in get_data.phenotype(new_phenotype, conn): phenotype structure provided
#> - conn argument ignored
#> # A tibble: 1,001 × 2
#> pn rtest_pheno50
#> <dbl> <chr>
#> 1 2000 lean
#> 2 2001 lean
#> 3 2002 lean
#> 4 2003 obease
#> 5 2004 obease
#> 6 2005 lean
#> 7 2006 lean
#> 8 2007 obease
#> 9 2008 lean
#> 10 2009 lean
#> # … with 991 more rows
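Note that in the printed result the pn column comes back as a number (<dbl>), although the PNs were uploaded as character strings. The sketch below works on a plain data frame that copies the ten rows printed above (a stand-in, not a live result) and shows how one might tabulate the categories and convert pn back to character, for example before joining with other PN-keyed tables.

```r
# Stand-in for the first ten rows returned by get_data(), copied from the
# printed output above
pheno_data <- data.frame(
  pn = 2000:2009,
  rtest_pheno50 = c("lean", "lean", "lean", "obease", "obease",
                    "lean", "lean", "obease", "lean", "lean"),
  stringsAsFactors = FALSE
)

# Count how many individuals fall into each category
counts <- table(pheno_data$rtest_pheno50)
counts

# Convert the PN column back to character
pheno_data$pn <- as.character(pheno_data$pn)
```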
We can list all available tags in the catalog using get_tags and passing the platform_connection object.
get_tags(conn)[1:4]
#> [1] "testTag4" "another-test-tag" "test-tag" "Cases"
Add one or more tags to this phenotype. Multiple tags can be given as a comma-separated string, "tag1,tag2", as a character vector, c("tag1", "tag2"), or as a combination of the two. Note that phenotype_add_tag is deprecated in favor of phenotype_add_tags, as the warning shows:
new_phenotype <- phenotype_add_tag(tag = "testTag2", new_phenotype, conn)
#> Warning: 'phenotype_add_tag' is deprecated.
#> Use 'phenotype_add_tags' instead.
#> See help("Deprecated")
phenotype_get_tags(new_phenotype)
#> [1] "testTag2"
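To make the accepted tag formats concrete, the following base R sketch flattens a comma-separated string, a character vector, or a mix of both into one vector of tags. normalize_tags is a hypothetical helper written for this illustration. Whether gorr itself trims whitespace or removes duplicates is not stated here, so those two steps are assumptions.

```r
# Hypothetical helper (not gorr's implementation): flatten any mix of
# comma-separated strings and character vectors into one vector of tags.
normalize_tags <- function(tags) {
  parts <- unlist(strsplit(tags, ",", fixed = TRUE))
  parts <- trimws(parts)        # assumption: surrounding spaces are not wanted
  unique(parts[nzchar(parts)])  # assumption: drop empty and duplicate tags
}

normalize_tags("tag1,tag2")
normalize_tags(c("tag1", "tag2"))
normalize_tags(c("tag1,tag2", "tag3"))
```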
To be sure, we can fetch the phenotype from the server and check its tags:
get_phenotype(name, conn) %>% phenotype_get_tags()
#> [1] "testTag2"
Set the tag list for this phenotype, overriding all previous tags. Tags are specified following the same rules as for phenotype_add_tag:
tags <- "testTag7,testTag8"
new_phenotype <- phenotype_set_tags(tags, new_phenotype, conn)
phenotype_get_tags(new_phenotype)
#> [1] "testTag7" "testTag8"
Again, to be sure, let’s load the tags from the server:
get_phenotype(name, conn) %>% phenotype_get_tags()
#> [1] "testTag7" "testTag8"
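Delete one or more tags from the phenotype. As the warning shows, phenotype_delete_tag is deprecated in favor of phenotype_delete_tags: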
new_phenotype <- phenotype_delete_tag(tag = "testTag7", new_phenotype, conn)
#> Warning: 'phenotype_delete_tag' is deprecated.
#> Use 'phenotype_delete_tags' instead.
#> See help("Deprecated")
phenotype_get_tags(new_phenotype)
#> [1] "testTag8"
# or get_phenotype(name, conn) %>% phenotype_get_tags()
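The three tag operations above can be thought of as set operations on a character vector. The sketch below reproduces the sequence from this section on a local vector, without contacting the server. Treating "add" as a union, so that an existing tag is not duplicated, is an assumption about the service rather than something the output above confirms.

```r
# Local model of a phenotype's tag list; no server calls are made
local_tags <- character(0)

# add: union with the new tags (assumed to ignore tags already present)
after_add <- union(local_tags, "testTag2")

# set: replace the whole list, overriding all previous tags
after_set <- c("testTag7", "testTag8")

# delete: remove the given tags and keep the rest
after_delete <- setdiff(after_set, "testTag7")
after_delete
```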
A phenotype can easily be removed from a project using the phenotype_delete function. phenotype_delete expects a phenotype object, so the phenotype needs to be fetched using the get_phenotype function if it has not already been assigned to a variable.
phenotype_delete(new_phenotype, conn)
#> Warning in deprecated_argument_msg(conn) %>% warning(): conn argument deprecated
#> Response [https://platform.wuxinextcodedev.com/api/phenotype-catalog/projects/ukbb_hg38/phenotypes/rtest_pheno50]
#> Date: 2022-06-16 15:56
#> Status: 204
#> Content-Type: <unknown>
#> <EMPTY BODY>