
APM Agent Developer at Nathan Claire Africa
- Kenya
- Permanent
- Full-time
- This is a remote position.
The developer will be responsible for building a lightweight, self-healing, auto-scaling, multi-platform APM (Application Performance Monitoring) agent that can:
- Automatically instrument applications to collect transaction traces, logs, and performance metrics.
- Capture distributed tracing across microservices.
- Track response times, error rates, resource usage, and database query performance.
- Collect and forward application, system, and security logs.
- Implement adaptive sampling to reduce overhead.
- Ensure asynchronous, non-blocking data collection.
- Optimize CPU, memory, and network utilization to minimize application impact.
- Assign and propagate trace IDs across microservices.
- Monitor slow queries and database calls with minimal overhead.
- Collect, filter, and forward application/system logs.
- Detect security anomalies and unusual resource usage patterns.
- Efficiently batch and compress data before sending to the APM platform.
- Use lightweight transport and serialization (e.g., gRPC with Protobuf) for communication.
- Implement triggers for auto-scaling based on CPU, memory, and latency thresholds.
- Enable self-healing by restarting services upon failure or excessive resource usage.
- Develop the agent within 2 to 3 weeks.
- Provide technical documentation and performance benchmarks.
- Programming Expertise: Proficiency in languages commonly used for APM agents such as Java, Python, Go, .NET, or C++.
- Instrumentation & Monitoring Experience: Hands-on experience with code profiling, distributed tracing (OpenTelemetry), and application instrumentation.
- Performance Optimization: Knowledge of efficient data collection strategies, async programming, and low-latency data transmission.
- Logging & Security: Experience integrating with logging pipelines (ELK, Splunk, Loki) and implementing basic security anomaly detection.
- Scalability & Resilience: Familiarity with auto-scaling, self-healing mechanisms, and cloud-native architectures.
- APM & Observability Tools: Experience with tools like Prometheus, OpenTelemetry, Datadog, New Relic, or Dynatrace is a plus.
- Networking & Communication Protocols: Proficiency in gRPC, Protobuf, or HTTP-based telemetry data transfer.
- Agile Development & Fast-Paced Execution: Ability to deliver a functional prototype within 2-3 weeks and iterate based on feedback.
- Strong Debugging & Problem-Solving Skills: Ability to analyze performance bottlenecks and optimize agent behavior.