Lokesh Aravapalli

@avnlk • IIIT Bangalore

FPGA • Systems • Low-Latency Computing

Integrated M.Tech student (graduating 2027). Building high-performance hardware-software co-designs through FPGA/HLS implementation, systems performance analysis, and operating systems optimization. Empirical problem-solver focused on latency-critical computing.

github linkedin codeforces

[projects]

protocol_performance_analysis →

Systems Benchmarking • Go • Protobuf

Kernel-level teardown of TCP, REST, and gRPC on Ubuntu 24.04. 9.9M latency measurements across 75 experiments. Exposed a 121× tail-latency amplification caused by a page-fault step function, traced to the Go allocator constant _MaxSmallSize = 32768.

5g_nr_channel_interleaver →

FPGA Miniproject 2 • Vivado HLS • AXI-Stream

128-bit AXI-Stream interleaver in two implementations: table-based (0.41 µs, 13.6× speedup over the algorithmic version) using O(1) BRAM lookups, and algorithmic. Bypassed 3GPP bottlenecks through compile-time optimization. Verified against test vectors.
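The table-versus-algorithmic trade-off can be sketched in software. This is a minimal illustration, not the project's HLS code: the source does not say which permutation is implemented, so a triangular write-rows/read-columns pattern is assumed here as a stand-in, and every function name is hypothetical. The algorithmic path recomputes the index order on each call; the table path computes it once and then does one lookup per output bit, which is the role the BRAM plays in hardware.

```python
def triangular_indices(num_bits):
    """Index order for a triangular interleaver (assumed stand-in permutation).

    Indices are written row by row into a triangle whose row r holds
    t - r cells, then read column by column, skipping unused cells.
    """
    # Smallest t such that the triangle has at least num_bits cells.
    t = 0
    while t * (t + 1) // 2 < num_bits:
        t += 1

    # Write phase: fill rows with input indices, None for padding cells.
    rows = []
    k = 0
    for r in range(t):
        row = []
        for _ in range(t - r):
            row.append(k if k < num_bits else None)
            k += 1
        rows.append(row)

    # Read phase: column c exists only in rows 0 .. t - c - 1.
    order = []
    for c in range(t):
        for r in range(t - c):
            idx = rows[r][c]
            if idx is not None:
                order.append(idx)
    return order


def interleave_algorithmic(bits):
    """Recompute the index order on every call."""
    return [bits[i] for i in triangular_indices(len(bits))]


def build_table(num_bits):
    """Precompute the index order once (the software analogue of a ROM/BRAM)."""
    return tuple(triangular_indices(num_bits))


def interleave_table(bits, table):
    """One table lookup per output bit."""
    return [bits[i] for i in table]
```

For a fixed block length both paths produce the same output, so the table version can be checked against the algorithmic one before being trusted, much as the blurb describes verification against test vectors.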

5g_nr_subblock_interleaver →

FPGA Miniproject 3 • HLS Optimization

Recognized that the 3GPP permutation resolves at compile time into chunk-level wire assignments. Achieved 2–5 cycle latency, 166 LUTs, 0 DSP/BRAM, ~540 MHz Fmax. Demonstrates spec-to-silicon optimization depth.
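The chunk-level view can be illustrated with a short software model. This sketch assumes the permutation is the 32-entry sub-block interleaver pattern used for polar codes in 3GPP TS 38.212; the source does not confirm this, the pattern below should be checked against the specification, and the names are hypothetical. Because the pattern is a constant, the block is split into 32 equal chunks and each output chunk is a fixed copy of one input chunk, which in hardware reduces to wiring rather than arithmetic.

```python
# Assumed 32-entry chunk order (sub-block interleaver pattern for polar
# codes in TS 38.212, to be verified against the specification).
CHUNK_ORDER = (
    0, 1, 2, 4, 3, 5, 6, 7, 8, 16, 9, 17, 10, 18, 11, 19,
    12, 20, 13, 21, 14, 22, 15, 23, 24, 25, 26, 28, 27, 29, 30, 31,
)


def subblock_interleave(d):
    """Permute a block as 32 equal chunks in the fixed CHUNK_ORDER."""
    n = len(d)
    if n == 0 or n % 32 != 0:
        raise ValueError("block length must be a positive multiple of 32")
    chunk = n // 32
    out = []
    for p in CHUNK_ORDER:
        # Output chunk k is input chunk CHUNK_ORDER[k], copied unchanged.
        out.extend(d[p * chunk:(p + 1) * chunk])
    return out
```

Since no index depends on the data, a synthesis tool can resolve every source position at compile time, which is consistent with the zero DSP/BRAM figure reported above.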

5g_nr_pdsch_bit_interleaver →

FPGA Miniproject 1 • Vivado HLS • Spartan-7

Hardware implementation of bit interleaving for PDSCH channels. Navigated Vivado RTL export bugs and LUT over-utilization. Latency: 18.8µs • Score: 9.5/10
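As a reference model for the hardware, the bit interleaving step can be sketched in a few lines. This assumes the row-column rule given for LDPC-coded channels in 3GPP TS 38.212, f[i + j*Qm] = e[i*(E/Qm) + j], where Qm is the modulation order; the source does not state the formula, and the function name is hypothetical.

```python
def pdsch_bit_interleave(e, qm):
    """Row-column bit interleaver (assumed TS 38.212 rule).

    Conceptually, e is written row-wise into a qm x (len(e) // qm)
    matrix and read out column-wise.
    """
    n = len(e)
    if qm <= 0 or n % qm != 0:
        raise ValueError("length of e must be a multiple of qm")
    cols = n // qm
    f = [None] * n
    for j in range(cols):
        for i in range(qm):
            f[i + j * qm] = e[i * cols + j]
    return f


# Example: 8 bits, modulation order 2 (QPSK).
print(pdsch_bit_interleave(list(range(8)), 2))
```

A model like this is mainly useful for generating test vectors to compare against the HLS output.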

systolic_dcim_architecture →

Conference Paper • In Revision

9×4 SRAM array for CNN inference via systolic topology. Novel in-memory compute architecture with tiling-based scalability methods. Status: addressing reviewer feedback; post-layout results in progress.