|
1 | 1 | # C++ High Performance Computing Optimization Guide |
2 | 2 |
|
3 | | -English | [简体中文](README.zh-CN.md) | [GitBook Sync Guide](docs/en/gitbook-sync.md) |
| 3 | +English | [简体中文](README.zh-CN.md) | [Documentation Home](https://lessup.github.io/cpp-high-performance-guide/) |
4 | 4 |
|
5 | | -[](https://github.com/LessUp/cpp-high-performance-guide/actions/workflows/build.yml) |
6 | | -[](https://github.com/LessUp/cpp-high-performance-guide/actions/workflows/benchmark.yml) |
7 | | -[](https://github.com/LessUp/cpp-high-performance-guide/actions/workflows/sanitizers.yml) |
8 | | -[](https://lessup.github.io/cpp-high-performance-guide/) |
| 5 | +A modern C++20 example collection for learning performance engineering across build systems, memory and cache behavior, SIMD, concurrency, benchmarking, and profiling. |
9 | 6 |
|
10 | | -A comprehensive collection of high-performance computing optimization examples and best practices for modern C++20. |
| 7 | +## Repository Overview |
11 | 8 |
|
12 | | -## Features |
13 | | - |
14 | | -- **Modern CMake Build System** - Target-based CMake with presets, FetchContent dependencies |
15 | | -- **Memory & Cache Optimization** - AOS vs SOA, false sharing, alignment, prefetching |
16 | | -- **Modern C++ Features** - constexpr, move semantics, vector reserve, C++20 ranges |
17 | | -- **SIMD Vectorization** - Auto-vectorization, SSE/AVX2/AVX-512 intrinsics, wrapper library |
18 | | -- **Concurrency** - Atomic operations, lock-free queues, OpenMP integration |
19 | | -- **Benchmarking Framework** - Google Benchmark integration, FlameGraph generation |
| 9 | +- `examples/`: five themed modules covering modern CMake, memory and cache optimization, modern C++ features, SIMD, and concurrency. |
| 10 | +- `benchmarks/` and `tools/`: benchmark runners, FlameGraph helpers, and analysis scripts. |
| 11 | +- `docs/`: bilingual learning path, profiling guide, and HonKit or GitBook synchronization notes. |
| 12 | +- `tests/`: unit, integration, and property-style checks for the example collection. |
20 | 13 |
|
21 | 14 | ## Quick Start |
22 | 15 |
|
23 | | -### Prerequisites |
24 | | - |
25 | | -- C++20 compatible compiler (GCC 11+, Clang 14+) |
26 | | -- CMake 3.20+ |
27 | | -- Ninja (recommended) or Make |
28 | | - |
29 | | -### Build |
30 | | - |
31 | 16 | ```bash |
32 | | -# Clone the repository |
33 | | -git clone https://github.com/LessUp/cpp-high-performance-guide.git |
34 | | -cd cpp-high-performance-guide |
35 | | - |
36 | | -# Configure and build (Release mode) |
37 | 17 | cmake --preset=release |
38 | 18 | cmake --build build/release |
39 | | - |
40 | | -# Run all benchmarks |
41 | | -cd build/release && ctest --output-on-failure |
42 | | -``` |
43 | | - |
44 | | -### Available Presets |
45 | | - |
46 | | -| Preset | Description | |
47 | | -|--------|-------------| |
48 | | -| `debug` | Debug build with symbols | |
49 | | -| `release` | Optimized release build (-O3, -march=native) | |
50 | | -| `asan` | AddressSanitizer enabled | |
51 | | -| `tsan` | ThreadSanitizer enabled | |
52 | | - |
53 | | -```bash |
54 | | -# Build with sanitizers |
55 | | -cmake --preset=asan |
56 | | -cmake --build build/asan |
57 | | -``` |
58 | | - |
59 | | -## Project Structure |
60 | | - |
61 | | -``` |
62 | | -cpp-high-performance-guide/ |
63 | | -├── cmake/ # CMake modules |
64 | | -│ ├── CompilerOptions.cmake # Compiler flags management |
65 | | -│ ├── Dependencies.cmake # FetchContent dependencies |
66 | | -│ ├── Sanitizers.cmake # Sanitizer configuration |
67 | | -│ └── ExampleTemplate.cmake # Example module template |
68 | | -├── examples/ |
69 | | -│ ├── 01-cmake-modern/ # CMake best practices vs anti-patterns |
70 | | -│ ├── 02-memory-cache/ # Memory and cache optimization |
71 | | -│ ├── 03-modern-cpp/ # Modern C++ features |
72 | | -│ ├── 04-simd-vectorization/ # SIMD and vectorization |
73 | | -│ └── 05-concurrency/ # Concurrent programming |
74 | | -├── benchmarks/ # Benchmark utilities |
75 | | -├── tests/ # Unit and property tests |
76 | | -├── tools/ # Analysis and profiling tools |
77 | | -└── docs/ # Documentation |
78 | | -``` |
79 | | - |
80 | | -## Example Modules |
81 | | - |
82 | | -### 01 - Modern CMake |
83 | | -Learn CMake best practices through anti-pattern vs best-practice comparisons. |
84 | | -- Target-based vs directory-based commands |
85 | | -- FetchContent dependency management |
86 | | -- Compiler options management |
87 | | - |
88 | | -### 02 - Memory & Cache Optimization |
89 | | -Master cache-friendly programming techniques. |
90 | | -- **AOS vs SOA**: Data layout impact on performance |
91 | | -- **False Sharing**: Multi-threaded cache line contention |
92 | | -- **Alignment**: SIMD-friendly memory alignment |
93 | | -- **Prefetching**: Manual prefetch hints |
94 | | - |
95 | | -### 03 - Modern C++ Features |
96 | | -Leverage modern C++ for performance. |
97 | | -- **constexpr/consteval**: Compile-time computation |
98 | | -- **Move Semantics**: Avoid unnecessary copies |
99 | | -- **Vector Reserve**: Reduce allocations |
100 | | -- **C++20 Ranges**: Modern iteration patterns |
101 | | - |
102 | | -### 04 - SIMD Vectorization |
103 | | -Maximize CPU throughput with SIMD. |
104 | | -- **Auto-vectorization**: Compiler-friendly patterns |
105 | | -- **Intrinsics**: SSE, AVX2, AVX-512 examples |
106 | | -- **SIMD Wrapper**: Readable SIMD abstractions |
107 | | - |
108 | | -### 05 - Concurrency |
109 | | -Write efficient multi-threaded code. |
110 | | -- **Atomic Operations**: Memory ordering explained |
111 | | -- **Lock-Free Queue**: SPSC queue implementation |
112 | | -- **OpenMP**: Simple parallelization patterns |
113 | | - |
114 | | -## Running Benchmarks |
115 | | - |
116 | | -```bash |
117 | | -# Run all benchmarks |
118 | | -cd build/release |
119 | | -ctest --output-on-failure |
120 | | - |
121 | | -# Run specific benchmark |
122 | | -./examples/02-memory-cache/bench/aos_soa_bench |
123 | | - |
124 | | -# Export benchmark results to JSON |
125 | | -./examples/02-memory-cache/bench/aos_soa_bench --benchmark_format=json > results.json |
126 | | -``` |
127 | | - |
128 | | -## Profiling |
129 | | - |
130 | | -Generate FlameGraph visualizations: |
131 | | - |
132 | | -```bash |
133 | | -# Record performance data |
134 | | -./tools/flamegraph/generate_flamegraph.sh ./build/release/examples/02-memory-cache/bench/aos_soa_bench |
135 | | - |
136 | | -# View the generated SVG |
137 | | -firefox flamegraph.svg |
| 19 | +ctest --preset=release |
138 | 20 | ``` |
139 | 21 |
|
140 | 22 | ## Documentation |
141 | 23 |
|
142 | | -- [Learning Path](docs/en/learning-path.md) - Recommended order for studying examples |
143 | | -- [Profiling Guide](docs/en/profiling-guide.md) - How to profile and analyze performance |
144 | | -- [GitBook Sync Guide](docs/en/gitbook-sync.md) - Connect this repository to GitBook for online reading |
145 | | - |
146 | | -## Contributing |
147 | | - |
148 | | -1. Fork the repository |
149 | | -2. Create a feature branch |
150 | | -3. Ensure all tests pass with sanitizers |
151 | | -4. Submit a pull request |
152 | | - |
153 | | -## License |
154 | | - |
155 | | -MIT License - see [LICENSE](LICENSE) for details. |
| 24 | +- Docs site: https://lessup.github.io/cpp-high-performance-guide/ |
| 25 | +- Learning path: `docs/en/learning-path.md` and `docs/zh/learning-path.md` |
| 26 | +- Profiling guide: `docs/en/profiling-guide.md` and `docs/zh/profiling-guide.md` |
| 27 | +- HonKit or GitBook sync notes: `docs/en/gitbook-sync.md` and `docs/zh/gitbook-sync.md` |
156 | 28 |
|
157 | | -## Acknowledgments |
| 29 | +## Development |
158 | 30 |
|
159 | | -- [Google Benchmark](https://github.com/google/benchmark) |
160 | | -- [Google Test](https://github.com/google/googletest) |
161 | | -- [RapidCheck](https://github.com/emil-e/rapidcheck) |
162 | | -- [FlameGraph](https://github.com/brendangregg/FlameGraph) |
| 31 | +- Contribution guide: `CONTRIBUTING.md` and `CONTRIBUTING.zh.md` |
| 32 | +- Changelog: `changelog/` |
| 33 | +- License: `LICENSE` |