We present our methodology for applying a well-established statistical dynamic power prediction technique, in a production environment, to an embedded commercial ‘scalar and vector co-processor’. We discuss the pitfalls we faced and the solutions we used to guide the statistical solver toward a low-error power predictor model. In our proposed method, we extracted processor stall probe-points, used selected microarchitectural events (which are later discarded), created instruction groups, and performed a short performance-event selection to refine the power model. We explain our approach to determining the processor’s dynamic power floor and an appropriate power-sampling window size in an architectural trace, as well as the tests selected for training. Our flow produces power weights for a set of architecturally visible events and a few optional microarchitectural events of the processor. From these weights, a canonical power prediction equation, configurable with a user-specified granularity of event abstraction, is auto-generated. Comparing the power predicted by our method against the golden power numbers from a commercial EDA tool, we obtain an average power error of 8%, and the model tracks the instantaneous power of previously unseen real application workloads reasonably well.
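As a rough illustration of what such a weight-based prediction equation can look like, the sketch below assumes a simple linear form: a constant power floor plus a weighted sum of event counts per sampling window. The linear form, the event names, the weights, the counts, and the reference values are all hypothetical placeholders chosen for illustration; none of them are taken from this work.

```python
# Minimal sketch of a linear, event-weighted power estimate.
# ASSUMPTION: power = floor + sum(weight[event] * count[event]) per window.
# All event names and numbers below are hypothetical, not from the paper.

def predict_power(floor, weights, counts):
    """Estimate power for one sampling window from its event counts."""
    return floor + sum(w * counts.get(event, 0) for event, w in weights.items())


def average_error_pct(predicted, reference):
    """Mean absolute percentage error of predictions against reference values."""
    errors = [abs(p - r) / r for p, r in zip(predicted, reference)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical power floor and per-event weights (arbitrary units).
FLOOR = 10.0
WEIGHTS = {"scalar_alu": 0.5, "vector_mac": 2.0, "load_store": 1.0, "stall": 0.25}

# Hypothetical event counts for two sampling windows of a trace.
windows = [
    {"scalar_alu": 4, "vector_mac": 2, "load_store": 1, "stall": 0},
    {"scalar_alu": 0, "vector_mac": 0, "load_store": 2, "stall": 8},
]

predicted = [predict_power(FLOOR, WEIGHTS, w) for w in windows]

# Hypothetical reference ("golden") power values for the same windows.
reference = [20.0, 14.0]
error_pct = average_error_pct(predicted, reference)
```

In a form like this, coarsening the event abstraction would amount to merging several entries of the weight table into one instruction group, while the constant term plays the role of the dynamic power floor.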