As System-on-Chip (SoC) designs grow in size and complexity, fabricating them as monolithic dies becomes increasingly costly, making chiplet-based integration a promising alternative. However, existing partitioning methods often produce sub-optimal results because they estimate design quality inaccurately, especially when physical information such as chiplet shape is unavailable. To address this challenge, we propose a chiplet partitioning and placement co-optimization flow that integrates placement into the partitioning stage, enabling more accurate evaluation of power, latency, and recurring engineering cost. By performing placement and chiplet shape adjustment concurrently during partitioning, our method compensates for the lack of early physical information and estimates design quality more accurately, improving final design quality by up to 37.87%, with an average improvement of 13.56%, compared to conventional two-stage flows.