KD-Tree / MyKD-Tree Range Query

Contributors

Andrew Sandell: [email protected]

Mack Hummel: [email protected]

Instructions

To run this project, navigate to the Sandell directory and use the makefile to compile.

Sequential Array Outputs will be in SequentialOutput.txt

KD-Tree Outputs will be in KD-TreeOutput.txt

MyKD-Tree Outputs will be in MyKD-TreeOutput.txt

Implementation

The variation of the KD-Tree that we implemented relied on the concept of carefully choosing which dimension to use when dividing a node. Our solution was to determine the dimension of the points with the greatest standard deviation, then divide along that dimension to create the most useful split possible. We accomplish this by proceeding with the normal KD-Tree insertion algorithm, up until we arrive at a block overflow. At this point, we loop over the points contained within that block, once for each dimension, and determine which of the dimensions has the greatest standard deviation. Once this is accomplished, we set this dimension as the split dimension, and proceed with the rest of our insertion algorithm. Finally, before recursively adding our nodes, we check to see if the chosen dimension has a standard deviation of 0. If so, we know that the points in this block are identical, so we split them evenly between two trees. We believe that this strategy will improve performance in datasets with high dimensionality, which have a number of points spread unevenly amongst a single dimension.

Performance

From our tests, we observed that there is a slight difference in query time with the MyKD-Tree for queries with lower ranges of values. At higher sizes of query ranges, the MyKD-Tree performs equal to, or better than the KD-Tree. This is a promising result and shows that the approach might have potential, especially for datasets with high standard deviations along a single axis, or datasets in which large points in higher-dimensional spaces are clustered together. However, we cannot conclude if the changes implemented in the MyKD-Tree are significantly more efficient than the KD-Tree at scale, or at dimensionality of k >= 3.

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
LICENSE		LICENSE
README.md		README.md
makefile		makefile
projDB		projDB
projquery		projquery
rangeQ.cpp		rangeQ.cpp
rangeQ.h		rangeQ.h
test2.txt		test2.txt
test3.txt		test3.txt
test4.txt		test4.txt
test5.txt		test5.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

KD-Tree / MyKD-Tree Range Query

Contributors

Instructions

Implementation

Performance

About

Releases 1

Packages

Contributors 2

Languages

License

absandell/KD-Tree-Range-Query

Folders and files

Latest commit

History

Repository files navigation

KD-Tree / MyKD-Tree Range Query

Contributors

Instructions

Implementation

Performance

About

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 2

Languages

Packages