Text-Vision Co-Instructed Editing Combines Semantic Intent with Spatial Precision | HACKOBAR_