Package 'handyFunctions' reference manual

Title:	Useful Functions for Handfully Manipulating and Analyzing Data with Data.frame Format
Description:	Useful functions for manipulating and analyzing data in data.frame format. It mainly includes the following sections: ReformatDataframe (reformat a data.frame with the modifiers), InteractDataframe, and Post-VCF (downstream analysis of data generated by vcftools (Petr Danecek, 2011) <http://dx.doi.org/10.1093/bioinformatics/btr330> or plink (Chang CC, 2015) <10.1186/s13742-015-0047-8>).
Authors:	Hongfei Liu
Maintainer:	Hongfei Liu <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.0
Built:	2025-01-25 06:25:45 UTC
Source:	https://github.com/luffylouis/handyfunctions

Check the validity of the given cols and return their indexes in rawDataFrame

Description

Check whether the cols given as input are valid in rawDataFrame and return their indexes.

Usage

checkCols(rawDataFrame, cols)

Arguments

`rawDataFrame`	raw data.frame
`cols`	col names or indexes given as input

Value

The indexes of cols, or FALSE if the cols input is invalid

Examples

library(handyFunctions)
data(people)
checkCols(people, c("..name", "..sex"))
# OR
checkCols(people, c(1, 2))

Return suggested dtype of vector input

Description

Return suggested dtype of vector input

Usage

checkDtype(vector)

Arguments

`vector`	vector/list input

Value

The suggested dtype of the vector

Examples

library(handyFunctions)
vector <- c(1, 2, 3, "", NA, "  ", "four", "NA", 5)
checkDtype(vector)

Grade records of virtual persons in high school

Description

A dataset containing the personal grade information (chinese, math, english, physics, biology, chemistry) of virtual persons.

Usage

grade

Format

A data frame with 6 rows and 7 variables:

name: name of the person, Chinese or foreign, in characters
chinese: grade in Chinese, in numbers
math: grade in math, in numbers
english: grade in English, in numbers
physics: grade in physics, in numbers
biology: grade in biology, in numbers
chemistry: grade in chemistry, in numbers

Source

"simulated dataset"

Return the indexes of the source vector that match the query vector

Description

Return the indexes of the source vector that match the query vector

Usage

matchIndex(SourceInfo, queryInfo, queryType = TRUE)

Arguments

`SourceInfo`	the source vector
`queryInfo`	the query vector
`queryType`	logical. Whether to use exact matching (default: TRUE)

Value

the indexes of the source vector that match the query vector

Examples

library(handyFunctions)
data(grade)
matchIndex(grade[, "name"], c("Ming Li", "Bang Wei"))
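
For comparison, the two matching modes can be sketched with base R alone. This is not the package's implementation: the names below are made-up data, and using match for exact matching and grep for partial matching is an assumption about what "accurate match" means here.

```r
# Hypothetical data; not the package's `grade` dataset.
source_names <- c("Ming Li", "Hong Wang", "Bang Wei", "Ming Zhao")
query_names <- c("Ming Li", "Bang Wei")

# Exact matching: position of each query element in the source vector.
exact_idx <- match(query_names, source_names)
print(exact_idx)  # 1 3

# Partial matching: indexes of source elements that contain the string "Ming".
partial_idx <- grep("Ming", source_names, fixed = TRUE)
print(partial_idx)  # 1 4
```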

Merge two data.frames based on xcol and ycol

Description

Merge two data.frames based on xcol and ycol

Usage

mergeCustom(x, y, xcol, ycol)

Arguments

`x`	the first data.frame
`y`	the second data.frame
`xcol`	col names to merge on in the first data.frame
`ycol`	col names to merge on in the second data.frame

Value

The new merged data.frame

Examples

library(handyFunctions)
data(people)
data(grade)
mergeCustom(people, grade, "..name", "name")
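
Merging on differently named key columns can also be done with base R's merge, shown here as a rough point of comparison. The two data.frames are made-up, and nothing here is taken from the internals of mergeCustom.

```r
# Hypothetical data.frames with differently named key columns.
people_df <- data.frame(full_name = c("Ming Li", "Bang Wei"), age = c(12, 15))
grade_df <- data.frame(name = c("Bang Wei", "Ming Li"), math = c(90, 85))

# by.x / by.y name the key column in the first and second data.frame.
merged <- merge(people_df, grade_df, by.x = "full_name", by.y = "name")
print(merged)
```

The merged result keeps the key under the first data.frame's column name (full_name) and sorts rows by that key.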

Return a reformatted data.frame with standardized col names

Description

Return a reformatted data.frame with standardized col names

Usage

modifyColNames(rawDataFrame, cols = TRUE, rawSep = "..", sep = "_")

Arguments

`rawDataFrame`	Raw data.frame input
`cols`	Specific col names or indexes that you want to reformat (default: TRUE, use all cols)
`rawSep`	Raw separator in the col names of the raw data.frame. Note: it is interpreted as a regular expression (regex), so "." matches any character. To match a literal dot, use "[.]".
`sep`	Separator to use in the col names of the modified data.frame

Value

A modified data.frame with col names separated by the given delimiter

Examples

library(handyFunctions)
data(people)
modified_people <- modifyColNames(people, rawSep = "[.][.]")
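
The difference between a regex dot and a literal dot can be seen with base R's gsub. This is only a sketch: it assumes rawSep is applied to each name as an ordinary regular expression, which the note on rawSep implies but the manual does not show. The name used is one of the col names of the people dataset.

```r
# A col name as listed for the `people` dataset.
raw_name <- "..death..age"

# As a regex, ".." matches ANY two characters, so every pair is replaced.
any_two <- gsub("..", "_", raw_name)

# "[.][.]" matches two literal dots only.
literal <- gsub("[.][.]", "_", raw_name)

print(any_two)  # "______"
print(literal)  # "_death_age"
```

Under that assumption, this would explain why the example passes rawSep = "[.][.]" rather than relying on the default "..".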

Return suggested appropriate dtypes for each column in rawDataFrame

Description

Return suggested appropriate dtypes for each column in rawDataFrame

Usage

modifyColTypes(rawDataFrame, cols = TRUE, dtype = FALSE, custom = FALSE)

Arguments

`rawDataFrame`	Raw data.frame
`cols`	Cols whose dtypes you want to change when custom is FALSE (default: TRUE, for all cols)
`dtype`	Dtypes, matched by index to cols, that you want to apply when custom is FALSE (default: FALSE, for automatic update)
`custom`	Whether to use auto or custom mode. When set to TRUE, you can specify custom dtypes for the given cols (default: FALSE, for auto)

Value

A new data.frame with the suggested dtype applied to each col

Examples

library(handyFunctions)
data(people)
modifyColTypes(people)

Return a reformatted data.frame with standardized row names

Description

Return a reformatted data.frame with standardized row names

Usage

modifyRowNames(rawDataFrame, rows = TRUE, rawSep = "..", sep = "_")

Arguments

`rawDataFrame`	Raw data.frame input
`rows`	Specific row names or indexes that you want to reformat (default: TRUE, use all rows)
`rawSep`	Raw separator in the row names of the raw data.frame. Note: it is interpreted as a regular expression (regex), so "." matches any character. To match a literal dot, use "[.]".
`sep`	Separator to use in the row names of the modified data.frame

Value

A modified data.frame with row names separated by the given delimiter

Examples

library(handyFunctions)
data(people)
modifyRowNames(people)

Basic information of virtual persons

Description

A dataset containing the personal basic information (name, sex, age, and death_age) of virtual persons.

Usage

people

Format

A data frame with 6 rows and 4 variables:

..name: name of the person, Chinese or foreign, in characters
..sex: sex of the person, in characters
..age: age at the final record, in numbers
..death..age: age at which the person died, in numbers

Source

"simulated dataset"

Return the values of queryCol in the source data.frame for the rows where sourceCol matches the given vector/list (with exact matching or not)

Description

Return the values of queryCol in the source data.frame for the rows where sourceCol matches the given vector/list (with exact matching or not)

Usage

queryingInfo(SourceData, sourceCol, queryCol, queryInfo, queryType = TRUE)

Arguments

`SourceData`	the source data.frame that you want to query
`sourceCol`	the col name or index of the query field in the source data.frame
`queryCol`	the col name or index of the return field in the source data.frame
`queryInfo`	vector/list. The query info
`queryType`	logical. Whether to use exact matching (default: TRUE)

Value

a vector from the return field (queryCol), for the rows matched with the query info in the source data

Examples

library(handyFunctions)
data(grade)
queryingInfo(grade, "name", "chinese", c("Ming Li", "Bang Wei"))
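
The lookup described by the arguments above (match on one col, return another) can be approximated in base R as follows. The data.frame is made-up and the use of match is an assumption about the exact-matching case, so treat this as a sketch rather than the package's method.

```r
# Hypothetical data; not the package's `grade` dataset.
grades <- data.frame(
  name = c("Ming Li", "Hong Wang", "Bang Wei"),
  chinese = c(88, 92, 75)
)
query <- c("Ming Li", "Bang Wei")

# Find the row of each query value in the query field ("name") ...
rows <- match(query, grades$name)

# ... and return the corresponding values of the return field ("chinese").
result <- grades[rows, "chinese"]
print(result)  # 88 75
```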

Show SNP density at chromosome level

Description

Show SNP density at chromosome level

Usage

ShowSNPDensityPlot(
  densityData,
  binSize,
  densityColorBar = c("grey", "darkgreen", "yellow", "red"),
  chromSet = c(1:22),
  withchr = FALSE
)

Arguments

`densityData`	the raw density data generated by vcftools
`binSize`	the bin size used when generating the density data
`densityColorBar`	vector. The color bar for the density plot (generally four colors)
`chromSet`	vector. The set of chromosomes to plot (it must match the CHROM column in densityData)
`withchr`	logical. Whether the chromosome labels of the density plot are prefixed with "chr". Note: this does not work when the chrom set contains other uncommon chrom symbols (e.g. NC0*, etc.)

Value

A ggplot2 object for SNP density plot

Examples

library(handyFunctions)
data(SNV_1MB_density_data)
ShowSNPDensityPlot(SNV_1MB_density_data, binSize = 1e6, chromSet = c(38:1))

The SNV number within 1Mb bins at chromosome level, generated from a transcriptome dataset of two dog populations (wild wolves and domesticated dogs).

Description

A dataset containing the SNV number within 1Mb bins, called from a transcriptome dataset of wild wolves and domesticated dogs.

Usage

SNV_1MB_density_data

Format

A data frame with 2544 rows and 4 variables:

CHROM: chrom id, reference genome CanFam3.1, in numbers/characters
BIN_START: the start genomic coordinate of the bin on the relevant chromosome, in numbers
SNP_COUNT: the number of SNVs within the bin, in numbers
VARIANTS.KB: number of SNVs (variants) per kb within the bin, in numbers
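
How the count and per-kb columns relate can be sketched as below. The rows are made-up, and the formula (count divided by the bin size in kb) is an assumption based on the column descriptions, not something taken from the package or from vcftools.

```r
# Bin size in base pairs (1 Mb), as used for this dataset.
binSize <- 1e6

# Hypothetical rows in the same layout as the density data.
density <- data.frame(
  CHROM = c(1, 1, 2),
  BIN_START = c(0, 1e6, 0),
  SNP_COUNT = c(250, 40, 1000)
)

# Variants per kb = count in the bin / bin size expressed in kb.
density$VARIANTS.KB <- density$SNP_COUNT / (binSize / 1000)
print(density)
```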

Source

"real dataset"

Return the vector of pieces at a given index after splitting one col of a data.frame by the given delimiter/separator

Description

Return the vector of pieces at a given index after splitting one col of a data.frame by the given delimiter/separator

Usage

splitCol(data, col = FALSE, sep, index, fixed = TRUE)

Arguments

`data`	vector or data.frame input
`col`	the col name or index, if the input is a data.frame
`sep`	the separator (delimiter)
`index`	the index of the piece that you want after splitting
`fixed`	logical. If TRUE, match sep exactly; otherwise use regular expressions. See strsplit for details.

Value

specific-indexed vector or factor

Examples

library(handyFunctions)
data(people)
splitCol(people, col = 1, sep = " ", index = 2)
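
Since the fixed argument refers to strsplit, the same kind of split-and-pick can be sketched with base R. The names are made-up, and this only illustrates the idea of taking the piece at a given index; it is not the body of splitCol.

```r
# Hypothetical names; not the package's `people` dataset.
full_names <- c("Ming Li", "Bang Wei", "Anna Smith")

# Split each string on a literal space (fixed = TRUE disables regex).
pieces <- strsplit(full_names, " ", fixed = TRUE)

# Take the second piece of every split string.
second <- vapply(pieces, function(p) p[2], character(1))
print(second)  # "Li" "Wei" "Smith"
```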

Reformat a data.frame with all modifiers simultaneously (colNames, rowNames and dtypes)

Description

Reformat a data.frame with all modifiers simultaneously (colNames, rowNames and dtypes)

Usage

unifyDataframe(
  rawDataFrame,
  rawRowSep = "..",
  rowSep = "_",
  rawColSep = "..",
  colSep = "_",
  changeDtype = TRUE
)

Arguments

`rawDataFrame`	raw data.frame
`rawRowSep`	raw separator (delimiter) of row names in the raw data.frame
`rowSep`	the new separator (delimiter) of row names
`rawColSep`	raw separator (delimiter) of col names in the raw data.frame
`colSep`	the new separator (delimiter) of col names
`changeDtype`	whether to change the dtypes of cols

Value

A modified data.frame with all of the above modifiers applied

Examples

library(handyFunctions)
data(people)
unifyDataframe(people, rawColSep = "[.][.]")
