
codespell spelling fixes #80

Merged · 2 commits · Oct 25, 2020
4 changes: 2 additions & 2 deletions README.md
@@ -313,7 +313,7 @@ This by default configures without any of the extra build tasks (such as buildin
| -DKOMPUTE_VK_API_MAJOR_VERSION=1 | Major version to use for the Vulkan API |
| -DKOMPUTE_VK_API_MINOR_VERSION=1 | Minor version to use for the Vulkan API |
| -DKOMPUTE_ENABLE_SPDLOG=1 | Enables the build with SPDLOG and FMT dependencies (must be installed) |
| -DKOMPUTE_LOG_VERRIDE=1 | Does not define the SPDLOG_<LEVEL> macros if these are to be overriden |
| -DKOMPUTE_LOG_VERRIDE=1 | Does not define the SPDLOG_<LEVEL> macros if these are to be overridden |
| -DSPDLOG_ACTIVE_LEVEL | The level for the log level on compile level (whether spdlog is enabled) |
| -DVVK_USE_PLATFORM_ANDROID_KHR | Flag to enable android imports in kompute (enabled with -DKOMPUTE_OPT_ANDROID_BUILD) |
| -DRELEASE=1 | Enable release build (enabled by cmake release build) |
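As a rough illustration of the log override flag listed above (a sketch under assumptions, not taken verbatim from the Kompute docs): when the build is configured so that Kompute does not define the SPDLOG_<LEVEL> macros itself, an application could supply its own definitions before including the single header, for example to silence selected levels.

// Hypothetical application-side override; assumes the project was configured
// with the log override flag above, so Kompute leaves these macros undefined.
#define SPDLOG_DEBUG(...) ((void)0)   // drop debug-level messages
#define SPDLOG_INFO(...) ((void)0)    // drop info-level messages

#include "kompute/Kompute.hpp"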
@@ -368,7 +368,7 @@ We appreciate PRs and Issues. If you want to contribute try checking the "Good f
* Uses cmake as build system, and provides a top level makefile with recommended command
* Uses xxd (or xxd.exe windows 64bit port) to convert shader spirv to header files
* Uses doxygen and sphinx for documentation and autodocs
* Uses vcpkg for finding the dependencies, it's the recommanded set up to retrieve the libraries
* Uses vcpkg for finding the dependencies, it's the recommended set up to retrieve the libraries

##### Updating documentation

2 changes: 1 addition & 1 deletion docs/index.rst
@@ -16,7 +16,7 @@ Index
Asynchronous & Parallel Operations <overview/async-parallel>
Memory Management Principles <overview/memory-management>
Converting GLSL/HLSL Shaders to C++ Headers <overview/shaders-to-headers>
Mobile App Intergration (Android) <overview/mobile-android>
Mobile App Integration (Android) <overview/mobile-android>
Game Engine Integration (Godot Engine) <overview/game-engine-godot>
Code Index <genindex>

14 changes: 7 additions & 7 deletions docs/overview/advanced-examples.rst
@@ -218,10 +218,10 @@ Back to `examples list <#simple-examples>`_.
// In this case we select device 0, and for queues, one queue from familyIndex 0
// and one queue from familyIndex 2
uint32_t deviceIndex(0);
std::vector<uint32_t> familyIndeces = {0, 2};
std::vector<uint32_t> familyIndices = {0, 2};

// We create a manager with device index, and queues by queue family index
kp::Manager mgr(deviceIndex, familyIndeces);
kp::Manager mgr(deviceIndex, familyIndices);

// We need to create explicit sequences with their respective queues
// The second parameter is the index in the familyIndex array which is relative
@@ -276,7 +276,7 @@ Back to `examples list <#simple-examples>`_.

// Here we can do other work

// We can now wait for thw two parallel tasks to finish
// We can now wait for the two parallel tasks to finish
mgr.evalOpAwait("queueOne")
mgr.evalOpAwait("queueTwo")

@@ -415,7 +415,7 @@ Converting to Kompute Terminology

1. Create a Sequence to record and submit GPU commands
2. Submit OpCreateTensor to create all the tensors
3. Record the OpAlgo with the Logistic Regresion shader
3. Record the OpAlgo with the Logistic Regression shader
4. Loop across number of iterations:
4-a. Submit algo operation on LR shader
4-b. Re-calculate weights from loss
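A minimal C++ sketch of those steps, assembled only from calls that appear elsewhere on this page (the Manager constructor, createManagedSequence and eval); the tensor-creation and shader-recording steps are left as comments because their exact signatures are not shown here, and the iteration count is an assumed placeholder.

// 1. Create a sequence to record and submit GPU commands
kp::Manager mgr(0, { 0 });                      // device 0, one queue from family 0
auto sq = mgr.createManagedSequence("lr", 0);

// 2. Submit OpCreateTensor to create all the tensors (weights, bias, inputs, outputs)
// 3. Record the OpAlgo with the Logistic Regression shader

// 4. Loop across number of iterations
const size_t iterations = 100;                  // assumed value for illustration
for (size_t i = 0; i < iterations; i++) {
    sq->eval();                                 // 4-a. run the recorded LR shader
    // 4-b. re-calculate weights from the loss on the host (see the loop further down)
}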
@@ -454,10 +454,10 @@ Converting to Kompute Terminology



#. Record the OpAlgo with the Logistic Regresion shader
#. Record the OpAlgo with the Logistic Regression shader
:raw-html-m2r:`<del>~</del>`\ :raw-html-m2r:`<del>~</del>`\ :raw-html-m2r:`<del>~</del>`\ :raw-html-m2r:`<del>~</del>`\ ~~

Once we re-record, all the instructions that were recorded previosuly are cleared.
Once we re-record, all the instructions that were recorded previously are cleared.

Because of this we can record now the new commands which will consist of the following:

@@ -526,7 +526,7 @@ Because of this we can record now the new commands which will consist of the fol
// Run evaluation which passes data through shader once
sq->eval();

// Substract the resulting weights and biases
// Subtract the resulting weights and biases
for(size_t j = 0; j < bOut->size(); j++) {
wInVec[0] -= wOutI->data()[j];
wInVec[1] -= wOutJ->data()[j];
10 changes: 5 additions & 5 deletions docs/overview/async-parallel.rst
@@ -69,7 +69,7 @@ Sequences can be executed in synchronously or asynchronously without having to c

While this is running we can actually do other things like in this case create the shader we'll be using.

In this case we create a shader that shoudl take a couple of milliseconds to run.
In this case we create a shader that should take a couple of milliseconds to run.

.. code-block:: cpp
:linenos:
@@ -164,7 +164,7 @@ Let's take a tangible example. The [NVIDIA 1650](http://vulkan.gpuinfo.org/displ

With this in mind, the NVIDIA 1650 as of today does not support intra-family parallelization, which means that if you were to submit commands in multiple queues of the same family, these would still be executed synchronously.

However the NVIDIA 1650 does support inter-family parallelization, which menas that if we were to submit commands across multiple queues from different families, these would execute in parallel.
However the NVIDIA 1650 does support inter-family parallelization, which means that if we were to submit commands across multiple queues from different families, these would execute in parallel.

This means that we would be able to execute parallel workloads as long as we're running them across multiple queue families. This is one of the reasons why Vulkan Kompute enables users to explicitly select the underlying queues and queue families to run particular workloads on.

@@ -189,10 +189,10 @@ You will want to keep track of the indices you initialize your manager, as you w
// In this case we select device 0, and for queues, one queue from familyIndex 0
// and one queue from familyIndex 2
uint32_t deviceIndex(0);
std::vector<uint32_t> familyIndeces = {0, 2};
std::vector<uint32_t> familyIndices = {0, 2};

// We create a manager with device index, and queues by queue family index
kp::Manager mgr(deviceIndex, familyIndeces);
kp::Manager mgr(deviceIndex, familyIndices);

We are now able to create sequences with a particular queue.

@@ -281,7 +281,7 @@ We are able to wait for the tasks to complete by triggering the `evalOpAwait` on

// Here we can do other work

// We can now wait for thw two parallel tasks to finish
// We can now wait for the two parallel tasks to finish
mgr.evalOpAwait("queueOne")
mgr.evalOpAwait("queueTwo")

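Putting the snippets above together, a minimal end-to-end sketch of the parallel-queue pattern might look as follows; it is assembled from the calls shown on this page rather than copied verbatim, and the asynchronous submission step is left as a comment because its exact call is not shown here.

// Device 0, with one queue from family 0 and one from family 2
uint32_t deviceIndex(0);
std::vector<uint32_t> familyIndices = { 0, 2 };
kp::Manager mgr(deviceIndex, familyIndices);

// Named sequences bound to each queue; the second argument is the index into
// the familyIndices list passed to the manager.
auto sqOne = mgr.createManagedSequence("queueOne", 0);
auto sqTwo = mgr.createManagedSequence("queueTwo", 1);

// ... submit the compute work asynchronously on each named sequence here ...

// Wait for the two parallel tasks to finish
mgr.evalOpAwait("queueOne");
mgr.evalOpAwait("queueTwo");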
24 changes: 12 additions & 12 deletions single_include/kompute/Kompute.hpp
@@ -690,7 +690,7 @@ namespace kp {
*
* Tensors are the base building block in Kompute to perform operations across
* GPUs. Each tensor would have a respective Vulkan memory and buffer, which
* woudl be used to store their respective data. The tensors can be used for GPU
* would be used to store their respective data. The tensors can be used for GPU
* data storage or transfer.
*/
class Tensor
@@ -733,7 +733,7 @@ class Tensor
/**
* Initialiser which calls the initialisation for all the respective tensors
* as well as creates the respective staging tensors. The staging tensors
* woudl only be created for the tensors of type TensorType::eDevice as
* would only be created for the tensors of type TensorType::eDevice as
* otherwise there is no need to copy from host memory.
*/
void init(std::shared_ptr<vk::PhysicalDevice> physicalDevice,
@@ -1267,12 +1267,12 @@ class Manager
* they would like to create the resources on.
*
* @param physicalDeviceIndex The index of the physical device to use
* @param familyQueueIndeces (Optional) List of queue indeces to add for
* @param familyQueueIndices (Optional) List of queue indices to add for
* explicit allocation
* @param totalQueues The total number of compute queues to create.
*/
Manager(uint32_t physicalDeviceIndex,
const std::vector<uint32_t>& familyQueueIndeces = {});
const std::vector<uint32_t>& familyQueueIndices = {});

/**
* Manager constructor which allows your own vulkan application to integrate
@@ -1509,7 +1509,7 @@ class Manager
std::unordered_map<std::string, std::shared_ptr<Sequence>>
mManagedSequences;

std::vector<uint32_t> mComputeQueueFamilyIndeces;
std::vector<uint32_t> mComputeQueueFamilyIndices;
std::vector<std::shared_ptr<vk::Queue>> mComputeQueues;

uint32_t mCurrentSequenceIndex = -1;
@@ -1523,7 +1523,7 @@

// Create functions
void createInstance();
void createDevice(const std::vector<uint32_t>& familyQueueIndeces = {});
void createDevice(const std::vector<uint32_t>& familyQueueIndices = {});
};

} // End namespace kp
@@ -1556,7 +1556,7 @@ class Algorithm
std::shared_ptr<vk::CommandBuffer> commandBuffer);

/**
* Initialiser for the shader data provided to the algoithm as well as
* Initialiser for the shader data provided to the algorithm as well as
* tensor parameters that will be used in shader.
*
* @param shaderFileData The bytes in spir-v format of the shader
@@ -1707,7 +1707,7 @@ class OpAlgoBase : public OpBase
* the barriers that ensure the memory has been copied before going in and
* out of the shader, as well as the dispatch operation that sends the
* shader processing to the gpu. This function also records the GPU memory
* copy of the output data for the staging bufffer so it can be read by the
* copy of the output data for the staging buffer so it can be read by the
* host.
*/
virtual void record() override;
@@ -1745,7 +1745,7 @@

} // End namespace kp

// Including implemenation for template class
// Including implementation for template class
#ifndef OPALGOBASE_IMPL
#define OPALGOBASE_IMPL

@@ -1972,7 +1972,7 @@ class OpAlgoLhsRhsOut : public OpAlgoBase<tX, tY, tZ>
* the barriers that ensure the memory has been copied before going in and
* out of the shader, as well as the dispatch operation that sends the
* shader processing to the gpu. This function also records the GPU memory
* copy of the output data for the staging bufffer so it can be read by the
* copy of the output data for the staging buffer so it can be read by the
* host.
*/
virtual void record() override;
@@ -1996,7 +1996,7 @@

} // End namespace kp

// Including implemenation for template class
// Including implementation for template class
#ifndef OPALGOLHSRHSOUT_CPP
#define OPALGOLHSRHSOUT_CPP

@@ -2247,7 +2247,7 @@ class OpTensorCopy : public OpBase
void init() override;

/**
* Records the copy commands from teh first tensor into all the other tensors provided. Also optionally records a barrier.
* Records the copy commands from the first tensor into all the other tensors provided. Also optionally records a barrier.
*/
void record() override;

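A hypothetical usage sketch for the copy operation documented above; the templated record call and the tensor variables are assumptions made for illustration and are not shown in this diff.

// Assumes tensorA, tensorB and tensorC are std::shared_ptr<kp::Tensor> created earlier.
auto sq = mgr.createManagedSequence("copy", 0);
sq->record<kp::OpTensorCopy>({ tensorA, tensorB, tensorC });  // assumed record<T> signature
sq->eval();  // the first tensor is the source; tensorB and tensorC receive its data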
20 changes: 10 additions & 10 deletions src/Manager.cpp
@@ -29,12 +29,12 @@ Manager::Manager()
{}

Manager::Manager(uint32_t physicalDeviceIndex,
const std::vector<uint32_t>& familyQueueIndeces)
const std::vector<uint32_t>& familyQueueIndices)
{
this->mPhysicalDeviceIndex = physicalDeviceIndex;

this->createInstance();
this->createDevice(familyQueueIndeces);
this->createDevice(familyQueueIndices);
}

Manager::Manager(std::shared_ptr<vk::Instance> instance,
@@ -119,7 +119,7 @@ Manager::createManagedSequence(std::string sequenceName, uint32_t queueIndex)
std::make_shared<Sequence>(this->mPhysicalDevice,
this->mDevice,
this->mComputeQueues[queueIndex],
this->mComputeQueueFamilyIndeces[queueIndex]);
this->mComputeQueueFamilyIndices[queueIndex]);
sq->init();

if (sequenceName.empty()) {
@@ -128,7 +128,7 @@ Manager::createManagedSequence(std::string sequenceName, uint32_t queueIndex)
{ KP_DEFAULT_SESSION + std::to_string(this->mCurrentSequenceIndex),
sq });
} else {
// TODO: Check if sequence doens't already exist
// TODO: Check if sequence doesn't already exist
this->mManagedSequences.insert({ sequenceName, sq });
}
return sq;
@@ -220,7 +220,7 @@ Manager::createInstance()
}

void
Manager::createDevice(const std::vector<uint32_t>& familyQueueIndeces)
Manager::createDevice(const std::vector<uint32_t>& familyQueueIndices)
{

SPDLOG_DEBUG("Kompute Manager creating Device");
@@ -251,7 +251,7 @@ Manager::createDevice(const std::vector<uint32_t>& familyQueueIndeces)
this->mPhysicalDeviceIndex,
physicalDeviceProperties.deviceName);

if (!familyQueueIndeces.size()) {
if (!familyQueueIndices.size()) {
// Find compute queue
std::vector<vk::QueueFamilyProperties> allQueueFamilyProperties =
physicalDevice.getQueueFamilyProperties();
@@ -272,14 +272,14 @@ Manager::createDevice(const std::vector<uint32_t>& familyQueueIndeces)
throw std::runtime_error("Compute queue is not supported");
}

this->mComputeQueueFamilyIndeces.push_back(computeQueueFamilyIndex);
this->mComputeQueueFamilyIndices.push_back(computeQueueFamilyIndex);
} else {
this->mComputeQueueFamilyIndeces = familyQueueIndeces;
this->mComputeQueueFamilyIndices = familyQueueIndices;
}

std::unordered_map<uint32_t, uint32_t> familyQueueCounts;
std::unordered_map<uint32_t, std::vector<float>> familyQueuePriorities;
for (const auto& value : this->mComputeQueueFamilyIndeces) {
for (const auto& value : this->mComputeQueueFamilyIndices) {
familyQueueCounts[value]++;
familyQueuePriorities[value].push_back(1.0f);
}
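To make the loop above concrete, a worked illustration with a made-up input (not taken from the repository):

// familyQueueIndices = { 0, 2, 2 }
//   familyQueueCounts     -> { {0, 1}, {2, 2} }              // queues requested per family
//   familyQueuePriorities -> { {0, {1.0f}}, {2, {1.0f, 1.0f}} }
// i.e. one entry per distinct family, presumably used further down to request
// that many queues from each family when the device is created.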
@@ -308,7 +308,7 @@ Manager::createDevice(const std::vector<uint32_t>& familyQueueIndeces)
&deviceCreateInfo, nullptr, this->mDevice.get());
SPDLOG_DEBUG("Kompute Manager device created");

for (const uint32_t& familyQueueIndex : this->mComputeQueueFamilyIndeces) {
for (const uint32_t& familyQueueIndex : this->mComputeQueueFamilyIndices) {
std::shared_ptr<vk::Queue> currQueue = std::make_shared<vk::Queue>();

this->mDevice->getQueue(familyQueueIndex,
2 changes: 1 addition & 1 deletion src/include/kompute/Algorithm.hpp
@@ -30,7 +30,7 @@ class Algorithm
std::shared_ptr<vk::CommandBuffer> commandBuffer);

/**
* Initialiser for the shader data provided to the algoithm as well as
* Initialiser for the shader data provided to the algorithm as well as
* tensor parameters that will be used in shader.
*
* @param shaderFileData The bytes in spir-v format of the shader
8 changes: 4 additions & 4 deletions src/include/kompute/Manager.hpp
@@ -29,12 +29,12 @@ class Manager
* they would like to create the resources on.
*
* @param physicalDeviceIndex The index of the physical device to use
* @param familyQueueIndeces (Optional) List of queue indeces to add for
* @param familyQueueIndices (Optional) List of queue indices to add for
* explicit allocation
* @param totalQueues The total number of compute queues to create.
*/
Manager(uint32_t physicalDeviceIndex,
const std::vector<uint32_t>& familyQueueIndeces = {});
const std::vector<uint32_t>& familyQueueIndices = {});

/**
* Manager constructor which allows your own vulkan application to integrate
@@ -271,7 +271,7 @@ class Manager
std::unordered_map<std::string, std::shared_ptr<Sequence>>
mManagedSequences;

std::vector<uint32_t> mComputeQueueFamilyIndeces;
std::vector<uint32_t> mComputeQueueFamilyIndices;
std::vector<std::shared_ptr<vk::Queue>> mComputeQueues;

uint32_t mCurrentSequenceIndex = -1;
@@ -285,7 +285,7 @@

// Create functions
void createInstance();
void createDevice(const std::vector<uint32_t>& familyQueueIndeces = {});
void createDevice(const std::vector<uint32_t>& familyQueueIndices = {});
};

} // End namespace kp
4 changes: 2 additions & 2 deletions src/include/kompute/Tensor.hpp
@@ -11,7 +11,7 @@ namespace kp {
*
* Tensors are the base building block in Kompute to perform operations across
* GPUs. Each tensor would have a respective Vulkan memory and buffer, which
* woudl be used to store their respective data. The tensors can be used for GPU
* would be used to store their respective data. The tensors can be used for GPU
* data storage or transfer.
*/
class Tensor
@@ -54,7 +54,7 @@ class Tensor
/**
* Initialiser which calls the initialisation for all the respective tensors
* as well as creates the respective staging tensors. The staging tensors
* woudl only be created for the tensors of type TensorType::eDevice as
* would only be created for the tensors of type TensorType::eDevice as
* otherwise there is no need to copy from host memory.
*/
void init(std::shared_ptr<vk::PhysicalDevice> physicalDevice,
4 changes: 2 additions & 2 deletions src/include/kompute/operations/OpAlgoBase.hpp
@@ -104,7 +104,7 @@ class OpAlgoBase : public OpBase
* the barriers that ensure the memory has been copied before going in and
* out of the shader, as well as the dispatch operation that sends the
* shader processing to the gpu. This function also records the GPU memory
* copy of the output data for the staging bufffer so it can be read by the
* copy of the output data for the staging buffer so it can be read by the
* host.
*/
virtual void record() override;
@@ -143,7 +143,7 @@

} // End namespace kp

// Including implemenation for template class
// Including implementation for template class
#ifndef OPALGOBASE_IMPL
#define OPALGOBASE_IMPL

4 changes: 2 additions & 2 deletions src/include/kompute/operations/OpAlgoLhsRhsOut.hpp
@@ -63,7 +63,7 @@ class OpAlgoLhsRhsOut : public OpAlgoBase<tX, tY, tZ>
* the barriers that ensure the memory has been copied before going in and
* out of the shader, as well as the dispatch operation that sends the
* shader processing to the gpu. This function also records the GPU memory
* copy of the output data for the staging bufffer so it can be read by the
* copy of the output data for the staging buffer so it can be read by the
* host.
*/
virtual void record() override;
@@ -87,7 +87,7 @@ class OpAlgoLhsRhsOut : public OpAlgoBase<tX, tY, tZ>

} // End namespace kp

// Including implemenation for template class
// Including implementation for template class
#ifndef OPALGOLHSRHSOUT_CPP
#define OPALGOLHSRHSOUT_CPP
