VSD - Distributed timing analysis within 100 lines of code


1) What happens when you type set_multi_cpu_usage -localCpu 4 on your EDA timing shell?
2) What happens when you type set_multi_cpu_usage -localCpu 4 -numThreads 4 on your EDA timing shell?

While working at my previous design companies, I was always curious about how jobs get spawned on different machines. What if there are fewer machines than jobs, or vice versa? How does a timing engine's algorithm handle this? I used to set up the entire distributed MMMC framework for timing tools at customer sites, which just meant setting the right variables (set_multi_cpu_usage), but I never knew what went on behind the tools. It is curiosity that leads to queries, which lead to exploration and, finally, to answers. I found my answers from Tsung-Wei, the architect of the popular open-source STA tool OpenTimer.

We all know that timing analysis is a critical task in the overall chip design flow, and a complex and difficult one. Today's chips contain billions of transistors, so timing analysis runtime is very large. We also need to analyze timing under different conditions, so a single run does not give the final result. While there are several solutions that mitigate this computation issue, most of them are architecturally constrained to a single machine. As design complexity continues to grow, we have to keep adding CPUs and memory to that machine, which is not very cost-efficient.

There are multiple places where we can introduce distributed computing into timing, and the major motivation is to speed up timing closure. We have to analyze timing under a range of conditions, typically quantified as modes (test mode, functional mode) and corners (PVT). The number of combinations (timing views) you have to run typically increases exponentially at lower nodes. That's where you need to distribute timing analyses across different machines.
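
To make the idea of timing views concrete, here is a minimal C++ sketch, assuming a simple round-robin placement. It enumerates views as mode/corner pairs and spreads them over a fixed number of machines. The function names `make_views` and `assign_views` and the `mode@corner` naming are hypothetical, chosen for illustration only; this is not the DtCraft or OpenTimer API, and a real scheduler would also weigh the memory and CPU available on each machine.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: one timing view is one (mode, corner) pair,
// named here as "mode@corner", e.g. "functional@slow".
std::vector<std::string> make_views(const std::vector<std::string>& modes,
                                    const std::vector<std::string>& corners) {
  std::vector<std::string> views;
  views.reserve(modes.size() * corners.size());
  for (const auto& mode : modes) {
    for (const auto& corner : corners) {
      views.push_back(mode + "@" + corner);
    }
  }
  return views;
}

// Hypothetical helper: round-robin placement (an assumption, not how any
// particular tool schedules). View i is assigned to machine i % num_machines,
// so the per-machine load differs by at most one view.
std::vector<std::vector<std::string>> assign_views(
    const std::vector<std::string>& views, std::size_t num_machines) {
  assert(num_machines > 0);
  std::vector<std::vector<std::string>> plan(num_machines);
  for (std::size_t i = 0; i < views.size(); ++i) {
    plan[i % num_machines].push_back(views[i]);
  }
  return plan;
}
```

With three views and three machines, each machine gets exactly one view. With three views and two machines, one machine has to analyze two views in sequence, which is the "more jobs than machines" case raised earlier.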
So let's distribute it, and do it within 100 lines of code using DtCraft, a high-performance cluster computing engine. Welcome to the webinar on "Distributed timing analysis within 100 lines of code".

  • Learn, code, and analyze a distributed framework
  • Take up and run STA for challenging designs with huge instance counts and witness the benefits of distributed STA


  • Introduction
  • Need for distributed-STA
  • Explain parallelism in the right way
    • DtCraft installation steps and webinar outline
    • Distributed timing concept and big-data tool issues
    • Hard-coded distributed MMMC framework
    • A new solution - DtCraft
  • DtCraft labs
    • Vanilla 'hello world' example using DtCraft
    • 'hello world' example description
    • Steps to compile 'hello world' example
    • Steps to run 'hello world' in DtCraft cluster
  • DtCraft labs in static timing analysis
    • Distributed timing analysis for 3 timing views on 3 machines
    • DtCraft test-run with 3 timing views using 2 machines
    • Q&A with participants on DtCraft runs for 3 timing views
    • Steps to overload memory and assignment description
  • Conclusion and assignment

Audience Profile

  • This course is for people who are proficient with timing concepts and want to move a level ahead and stay ahead of the curve
  • Anyone enthusiastic to learn about distributed timing analysis from scratch, i.e. from the C++ code level

Prerequisites


  • Be able to install and run OpenTimer, an open-source STA tool
  • Be able to understand Unix commands
  • Be able to understand basic STA terminology, which can be learned on Udemy from the Static timing analysis - Part 1 and 2 courses

Tools Used

DtCraft is a general-purpose programming system to make parallel and distributed computing easier to handle.
