Title: | A Cost-Minimal Regular Spanning Subgraph with TreeClust |
---|---|
Description: | Construct minimum-cost regular spanning subgraph as part of a non-parametric two-sample test for equality of distribution. |
Authors: | Dave Ruth, Sam Buttrey |
Maintainer: | Sam Buttrey <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0-3 |
Built: | 2024-11-23 03:02:05 UTC |
Source: | https://github.com/cran/AcrossTic |
Construct minimum-cost regular spanning subgraph as part of a non-parametric two-sample test for equality of distribution.
The DESCRIPTION file:
Package: | AcrossTic |
Version: | 1.0-3 |
Date: | 2016-08-12 |
Title: | A Cost-Minimal Regular Spanning Subgraph with TreeClust |
Author: | Dave Ruth, Sam Buttrey |
Maintainer: | Sam Buttrey <[email protected]> |
Depends: | treeClust (>= 1.1-6), lpSolve |
Description: | Construct minimum-cost regular spanning subgraph as part of a non-parametric two-sample test for equality of distribution. |
License: | GPL (>= 2) |
NeedsCompilation: | no |
Packaged: | 2016-08-12 22:41:34 UTC; sebuttre |
Date/Publication: | 2016-08-13 11:01:25 |
Repository: | https://buttrey.r-universe.dev |
RemoteUrl: | https://github.com/cran/AcrossTic |
RemoteRef: | HEAD |
RemoteSha: | 76a9a22f1e01cb9c451f430bdbe68e4f80b51a82 |
Index of help topics:
rRegMatch | Compute r-regular matching spanning trees |
print.AcrossTic | print an AcrossTic object, the output from rRegMatch |
plot.AcrossTic | plot an AcrossTic object, the output from rRegMatch |
ptest | perform permutation test on AcrossTic object |
print.AcrossTicPtest | print an AcrossTic permutation test object |
The package primarily provides rRegMatch, which, given arguments X and r, produces a minimum-distance r-regular subgraph on the rows of X.
Dave Ruth, Sam Buttrey
Maintainer: Sam Buttrey <[email protected]>
David Ruth, "A new multivariate two-sample test using regular minimum-weight spanning subgraphs," J. Stat. Distributions and Applications (2014)
set.seed (123)
X <- matrix (rnorm (100), 50, 2) # Create data...
y <- rep (c (1, 2), each=25) # ...and class membership
## Not run: rRegMatch (X, r = 3, y = y)
Plot an object of class AcrossTic (see details). Currently intended for two-class objects built with two-dimensional Xs.
## S3 method for class 'AcrossTic'
plot(x, X.values, y, grp.cols = c(2, 4), grp.pch = c(16, 17), ...)
x | AcrossTic object, normally the output from |
X.values | Matrix of data. If not supplied the function looks in |
y | Vector with two distinct values giving the label for each observation. |
grp.cols | Colors for the two groups. Default: 2 and 4. |
grp.pch | Plotting points for the two groups. Default: 16 and 17. |
... | Other arguments, passed on to |
This draws the matching produced by rRegMatch. Points are plotted in 2d; within-group matches are then shown with dotted lines and between-group pairings with solid ones. If X has more than two columns, the first two are used, with a warning. If y is supplied it will be used; if not, it will be extracted from x; if no y is found, an error is issued. y must have exactly two distinct values.
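The drawing rule described above can be sketched in base R graphics. This is only an illustration under stated assumptions, not the package's plot method: plot_matches_sketch and the hand-made matches matrix are hypothetical names introduced here, and the matrix stands in for the two-column matches element that rRegMatch is documented to return.

```r
# Hypothetical helper: draw points by group, then one line segment per matched pair.
plot_matches_sketch <- function(X, matches, y,
                                grp.cols = c(2, 4), grp.pch = c(16, 17)) {
  if (ncol(X) > 2) warning("X has more than two columns; using the first two")
  g <- as.integer(factor(y))                 # map the two labels to 1 and 2
  stopifnot(length(unique(g)) == 2)          # exactly two distinct classes
  plot(X[, 1], X[, 2], col = grp.cols[g], pch = grp.pch[g],
       xlab = "X1", ylab = "X2")
  # TRUE where a pair joins observations from different groups
  cross <- g[matches[, 1]] != g[matches[, 2]]
  segments(X[matches[, 1], 1], X[matches[, 1], 2],
           X[matches[, 2], 1], X[matches[, 2], 2],
           lty = ifelse(cross, 1, 3))        # solid = between, dotted = within
  invisible(cross)
}

set.seed(123)
X <- matrix(rnorm(12), 6, 2)                 # six toy points
y <- rep(c(0, 1), each = 3)                  # two groups of three
matches <- rbind(c(1, 2), c(2, 4), c(3, 5), c(5, 6))  # hand-made pairs, not an optimal matching
cross <- plot_matches_sketch(X, matches, y)
```

The helper returns the between-group indicator invisibly so the line types can be checked without inspecting the picture.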
No output. Side effect: a plot is produced.
David Ruth and Sam Buttrey
set.seed (123)
X <- matrix (rnorm (100), 50, 2) # Create data...
y <- rep (c (0, 1), each=25) # ...and class membership
plot (rRegMatch (X, r = 3, y = y))
Print some attributes of an AcrossTic object to the screen.
## S3 method for class 'AcrossTic'
print(x, ...)
x | AcrossTic item (output from |
... | Other arguments, currently ignored. |
None.
Sam Buttrey
Print the output of a permutation test on an AcrossTic object (see ptest).
## S3 method for class 'AcrossTicPtest'
print(x, ...)
x | Object of class AcrossTicPtest |
... | Other arguments, currently ignored. |
The output from ptest has class AcrossTicPtest. This function prints such an object.
None
Sam Buttrey
This function permutes the "y" entries in an AcrossTic object and computes the cross-count statistic for each permutation. This generates a null distribution suitable for use in a permutation test.
ptest(acobj, y, edge.weights, n = 1000)
acobj | Object of class AcrossTic, output from |
y | Character, factor or logical indicating class membership for each observation. Normally this will be found inside |
edge.weights | Vector of weights associated with each match. If omitted, the default is a vector of 1's of the proper length, unless the "acobj" object was computed with partial matching, in which case omitting edge.weights produces an error. |
n | Integer, number of simulations. Default: 1000. |
This function permutes the y's n times and computes the cross-count-match statistic. If the observed value in the acobj is generally smaller than the permuted values, we conclude the distributions of the classes are different.
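The permutation logic can be sketched in base R as follows. This is a rough illustration rather than the package's code: cross_count_sketch and perm_test_sketch are hypothetical names, the toy matches matrix is made up, and the lower-tail p-value (the share of permuted statistics at or below the observed one) is an assumption based on the description above.

```r
# Hypothetical helper: (weighted) number of matched pairs whose labels differ.
cross_count_sketch <- function(matches, y,
                               edge.weights = rep(1, nrow(matches))) {
  sum(edge.weights * (y[matches[, 1]] != y[matches[, 2]]))
}

# Hypothetical helper: shuffle y n times while holding the matching fixed.
perm_test_sketch <- function(matches, y, n = 1000) {
  observed <- cross_count_sketch(matches, y)
  sims <- replicate(n, cross_count_sketch(matches, sample(y)))
  # Assumed definition: small observed counts are evidence of a difference.
  list(sims = sims, observed = observed, p.value = mean(sims <= observed))
}

set.seed(123)
# Toy matching: two triangles, so no pair crosses between the classes.
matches <- rbind(c(1, 2), c(2, 3), c(1, 3), c(4, 5), c(5, 6), c(4, 6))
y <- rep(c("One", "Two"), each = 3)
res <- perm_test_sketch(matches, y, n = 1000)
res$observed   # observed cross-count for the original labels
res$p.value    # share of permutations with a cross-count this small
```

Because the matching is held fixed and only the labels are shuffled, each permutation costs one pass over the matched pairs; no optimization is rerun.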
A list with class AcrossTicPtest and three components:
sims | Vector of |
observed | Observed cross-count statistic |
p.value | P-value for test |
Sam Buttrey and Dave Ruth
set.seed (123)
X <- matrix (rnorm (100), 50, 2) # Create data...
y <- rep (c ("One", "Two"), each=25) # ...and class membership
## Not run: ptest (rRegMatch (X, r = 3, y = y)) # p = .479
X[1:25,] <- X[1:25,] + 1
## Not run: ptest (rRegMatch (X, r = 3, y = y)) # p = .037
This function matches each observation in X to r others so as to minimize the total distance across all matches. Optionally it computes the cross-count statistic – the number of matches associated with two observations from different classes.
rRegMatch(X, r, y = NULL, dister = "daisy", dist.args = list(),
          keep.X = nrow(X) < 100, keep.D = (dister == "treeClust.dist"),
          relax = (N >= 100), thresh = 1e-6)
X | Matrix or data frame of data, or inter-point distances represented in an object inheriting from "dist" |
r | Integer number of matches. The matching is "regular" in that every observation is matched to exactly r others (or, if relax=TRUE, every observation is matched to others with weights in [0, 1] that add up to r). |
y | Vector of class membership indices. This is used to compute the cross-count statistic. Optional. |
dister | Function to compute inter-point distances. This must take as its first argument a matrix of data argument name |
dist.args | List of arguments to the |
keep.X | If TRUE, and X was supplied, keep the X matrix in the output object. Default: TRUE if X was supplied and also nrow (X) < 100. |
keep.D | If TRUE, keep the distance object in the output. Default: TRUE if the |
relax | If FALSE, solve the exact problem where each observation gets exactly r non-zero pairings, each with weight 1. If TRUE, solve the relaxed problem, where each observation has at least r non-zero pairings, each with its own weight between 0 and 1, the weights adding up to r. The exact problem gets very slow with large samples. |
thresh | Weights smaller than this are considered to be exactly zero. Default: 1e-6. |
This function solves an optimization problem to extract the set of pairings which make the total weight (distance) associated with all pairings a minimum, subject to the constraint that every observation is paired to r others (or to enough others to have a total pair-weight of r).
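As an illustration of the optimization just described, the exact problem can be written as a small 0/1 linear program. The sketch below uses lpSolve (listed in Depends) on six toy points. It is a sketch under assumptions, not the package's internals: Euclidean distances from dist stand in for the default "daisy" distance, and the names edges, cost, A, sol and matches are introduced here for illustration.

```r
library(lpSolve)

set.seed(1)
X <- matrix(rnorm(12), 6, 2)         # six toy points in the plane
r <- 2                               # every point must get exactly r matches
D <- as.matrix(dist(X))              # assumption: plain Euclidean distances

edges <- t(combn(nrow(X), 2))        # one row per candidate pair (i < j)
cost  <- D[edges]                    # distance of each candidate pair

# Incidence matrix: row i has a 1 in each column whose pair involves point i.
A <- matrix(0, nrow(X), nrow(edges))
A[cbind(edges[, 1], seq_len(nrow(edges)))] <- 1
A[cbind(edges[, 2], seq_len(nrow(edges)))] <- 1

# Minimize total distance subject to every point having degree exactly r,
# with each pair either used (1) or not used (0).
sol <- lp("min", cost, A, rep("=", nrow(X)), rep(r, nrow(X)), all.bin = TRUE)

matches <- edges[sol$solution > 0.5, , drop = FALSE]  # selected pairs
sol$status   # 0 means lpSolve found an optimum
sol$objval   # total distance across the selected pairs
```

Under the same assumptions, the relaxed problem would drop all.bin = TRUE and add one constraint per pair capping its weight at 1, so that weights may be fractional while still summing to r for each point. Note that n * r must be even for the exact problem to be feasible.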
A list of class AcrossTic, with elements:
matches | A two-column matrix, each row giving the indices of one matched pair. |
total.dist | total distance across all matches – the optimal value from the optimization problem. |
status | Status of result – if the optimum was found, a vector of length 1 with name "TM_OPTIMAL_SOLUTION_FOUND" and value 0. |
time.required | Time taken to run the optimization, as reported by |
call | The call made to the function, from |
r | The value of r, as supplied at the time of the call. |
dister | The value of dister, as supplied at the time of the call. |
dist.args | The value of dist.args, as supplied at the time of the call. |
X.supplied | Logical indicating whether X was supplied. |
X | X matrix, if it was available and asked to be kept |
y | y vector, as supplied |
edge.weights | vector, of length |
cross.sum | Sum of matcher.costs across all matches |
cross.count | Number of matches between two observations of different classes, possibly weighted |
nrow.X, ncol.X | dimension of X matrix |
David Ruth and Sam Buttrey
David Ruth, "A new multivariate two-sample test using regular minimum-weight spanning subgraphs," J. Stat. Distributions and Applications (2014)
set.seed (123)
X <- matrix (rnorm (100), 50, 2) # Create data...
y <- rep (c (1, 2), each=25) # ...and class membership
rRegMatch (X, r = 3, y = y)
## Not run: plot (rRegMatch (X, r = 3, y = y)) # to see picture