CN114359920B - Image processing method, device, equipment and storage medium - Google Patents

Image processing method, device, equipment and storage medium

Info

Publication number
CN114359920B
Authority
CN
China
Prior art keywords
images
image
document
video
mark
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011065951.9A
Other languages
Chinese (zh)
Other versions
CN114359920A (en
Inventor
王倩
林彬彬
邓佳康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd
Priority to CN202011065951.9A
Publication of CN114359920A
Application granted
Publication of CN114359920B

Landscapes

  • Processing Or Creating Images (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The application discloses an image processing method, an image processing device, image processing equipment, and a storage medium. The method comprises: identifying the image content of N images; when the image content of the N images is identified as containing document materials, intercepting document material images from the N images to obtain M intercepted images; splicing the M intercepted images; and outputting the spliced file in an electronic document format. According to the scheme provided by the embodiment of the application, document material images are intercepted from the images that contain document materials, the intercepted document material images are spliced, and the spliced file is output in an electronic document format, so that the time needed to organize documents such as PPT or courseware is effectively reduced and efficiency is improved.

Description

Image processing method, device, equipment and storage medium
Technical Field
The present invention relates generally to the field of image technology, and in particular, to an image processing method, apparatus, device, and storage medium.
Background
With the development of technology, presenting document materials such as PPT or courseware has become very common in meetings, training, and teaching. Lecturing from such documents is convenient for the lecturer, because it avoids the inefficiency of writing on a whiteboard or blackboard in real time. It is inconvenient for the audience, however: because the lecturer no longer spends time writing, the lecture proceeds relatively quickly, and listeners often cannot keep up with note-taking.
At present, most listeners record the content of documents such as PPT or courseware by recording video or taking photos, and organize that content after the lecture ends, which is inefficient.
Disclosure of Invention
In view of the foregoing drawbacks or shortcomings in the prior art, it is desirable to provide an image processing method, apparatus, device, and storage medium.
In a first aspect, the present application provides an image processing method, the method comprising:
Identifying image content of the N images;
When the image content of the N images is identified to contain document data, intercepting the document data images from the N images to obtain M intercepted images;
splicing the M intercepted images, and outputting a spliced file in an electronic document format;
Wherein N is a positive integer, and M is a positive integer less than or equal to N.
In one embodiment, the image is a video frame image;
before identifying the image content of the N images, further comprising:
acquiring mark points recorded in a target video;
and determining, according to the mark points, the video frame images corresponding to the mark points in the target video.
In one embodiment, before acquiring the mark points recorded in the target video, the method further includes:
receiving a mark input on a target video in the recording process or the playing process of the target video;
marking a mark point in a corresponding video frame image in the target video in response to the mark input;
Wherein each marking point corresponds to a video frame image.
In one embodiment, splicing the intercepted images includes:
acquiring the playing time sequence, in the target video, of the document material images corresponding to the M intercepted images;
determining a first splicing sequence of the M intercepted images according to the playing time sequence;
and splicing the M intercepted images according to the first splicing sequence.
In one embodiment, splicing the intercepted images includes:
determining the document page numbers of the document material images corresponding to the M intercepted images;
determining a second splicing sequence of the M intercepted images according to the document page numbers;
and splicing the M intercepted images according to the second splicing sequence.
In one embodiment, intercepting the document material images from the N images to obtain M intercepted images comprises:
when identical document material images exist among the document material images intercepted from the N images, taking one of the identical document material images as one intercepted image.
In one embodiment, when it is identified that an included angle exists between any boundary of the document materials contained in the image content and the corresponding boundary of the image in which the document materials are located, and the included angle is larger than an included angle threshold,
intercepting the document material images from the N images includes:
performing perspective correction cropping on the document materials.
In one embodiment, the electronic document format includes any one of a presentation file format, a PDF format, a rich text format, a word format, and a text editing system document format.
In one embodiment, the documentation includes any one of PPT documentation and courseware documentation.
In a second aspect, the present application provides an image processing apparatus comprising:
The identification module is used for identifying the image content of the N images;
the intercepting module is used for intercepting document data images from the N images to obtain M intercepted images when the image contents of the N images are identified to contain document data;
and the output module is used for splicing the M intercepted images and outputting the spliced file in an electronic document format.
In a third aspect, the present application provides an apparatus comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the image processing method as in the first aspect when executing the program.
In a fourth aspect, the present application provides a readable storage medium having stored thereon a computer program which, when executed by a processor, implements the image processing method as in the first aspect.
According to the technical scheme provided by the embodiment of the application, the document data images are intercepted from the images containing the document data, the intercepted document data images are spliced, and the spliced file is output in an electronic document format, so that the time for sorting documents such as PPT or courseware is effectively reduced, and the efficiency is improved.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the accompanying drawings in which:
Fig. 1 is a schematic flow chart of an image processing method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be noted that, for convenience of description, only the portions related to the application are shown in the drawings.
In order to make the present application better understood by those skilled in the art, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the described embodiments of the application may be implemented in other sequences than those illustrated or otherwise described herein.
Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or modules is not necessarily limited to those steps or modules that are expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
At present, presenting document materials such as PPT or courseware is very common in meetings, training, and teaching. Lecturing from such documents is convenient for the lecturer and avoids the inefficiency of writing on a whiteboard or blackboard in real time. However, because the lecturer no longer spends time writing, the lecture proceeds relatively quickly, and listeners often cannot keep up with note-taking.
At present, most listeners record the content of documents such as PPT or courseware by recording video or taking photos, and organize the PPT or courseware content after the lecture ends, which is inefficient.
In view of the above problems, the application aims to provide an image processing method that organizes document materials such as PPT or courseware recorded by video or by photo efficiently and with high user satisfaction.
The method can be applied to terminal equipment provided with a camera, wherein the terminal equipment can be a mobile phone, a tablet computer, a notebook computer, an intelligent helmet, intelligent glasses, a telephone watch and the like.
It should be noted that, in the image processing method provided in the embodiment of the present invention, the execution body may be an image processing apparatus, and the image processing apparatus may be implemented as part or all of the terminal device by software, hardware, or a combination of software and hardware. In the following method embodiments, the execution subject is a terminal device.
Referring to fig. 1, a flowchart of an image processing method according to an embodiment of the present application is shown.
As shown in fig. 1, an image processing method may include:
S110, identifying the image content of the N images.
Specifically, the image may be a video frame image (i.e., the image corresponding to a certain frame in a video), or a picture image (e.g., a photograph taken by a camera, a screenshot, etc.), or the like. The image may be obtained directly from the terminal device that recorded the video or took the picture, from a storage device that stores the recorded video or picture, or by downloading. The form of the image and the manner in which it is obtained are not limited.
Identifying the image content of an image may be accomplished with a trained neural network model. Identification may also be performed in other ways.
If the image is a picture image, the obtained picture image is directly input into a neural network model, and then the image content identification of the image can be completed.
If the image is a video frame image, the acquired video needs to be processed to obtain the video frame image.
In one embodiment, the image is a video frame image, and prior to identifying the image content of the N images, the method further comprises:
acquiring mark points recorded in a target video;
and determining, according to the mark points, the video frame images corresponding to the mark points in the target video.
Specifically, the target video is a video that carries recorded mark points; it may be a video recorded by the user, a stored video, a downloaded video, or the like. The mark points may be input by the user or by the terminal device. There are N mark points, where N is a positive integer. A video frame image may in general be any frame of the video; in this embodiment, the video frame images are those corresponding to the mark points in the target video, so N mark points correspond to N video frame images.
In one embodiment, before acquiring the mark points recorded in the target video, the method further includes:
receiving a mark input on a target video in the recording process or the playing process of the target video;
marking a mark point in a corresponding video frame image in the target video in response to the mark input;
Wherein each marking point corresponds to a video frame image.
Specifically, while recording or playing a video, the user may mark mark points on the video according to actual needs. Mark points may also be marked automatically at intervals of a preset duration, which can be set according to actual needs. It can be understood that if the preset duration is too large, that is, mark points are far apart in time, images whose content contains document materials may be missed. If the preset duration is too small, that is, mark points are close together in time, images containing the same document materials may be marked repeatedly, so that more images have to be identified and identification takes longer. The preset duration may be determined through training of the neural network model.
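The interval-based marking described above can be sketched as follows. This is a minimal illustration only: the function name `interval_mark_points` and its parameters are hypothetical and do not come from the application, and timestamps are assumed to be expressed in seconds.

```python
def interval_mark_points(duration_s, interval_s):
    """Return mark-point timestamps (in seconds) spaced interval_s apart.

    Hypothetical helper: the application only states that mark points can be
    placed automatically at intervals of a preset duration.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    marks = []
    k = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while k * interval_s <= duration_s:
        marks.append(k * interval_s)
        k += 1
    return marks


# A smaller interval yields more frames to identify later.
sparse = interval_mark_points(60, 20)
dense = interval_mark_points(60, 2)
```

The trade-off in the text is visible in the result sizes: the sparse setting yields few candidate frames and may skip a slide, while the dense setting yields many frames that mostly repeat the same slide.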
Mark points may also be marked manually by the user, while recording or playing the video, whenever the lecturer turns a page of the PPT or courseware.
Mark points may also be determined in real time by an algorithm that judges whether the document materials contained in the image content of adjacent frames have changed. If they have changed, a mark point may be marked automatically, or a pop-up window may ask the user whether marking is needed, and the user chooses whether to mark the point according to actual needs. It should be noted that mark points may also be marked on the video in other ways, which are not limited here.
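One simple way to judge whether adjacent frames have changed is to compare their pixel values. The sketch below is a stand-in for the unspecified algorithm, not the method of the application: it assumes each frame is a grayscale image given as a list of rows of integers in 0..255, and the names `mean_abs_diff`, `auto_mark_indices`, and the default threshold are hypothetical.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def auto_mark_indices(frames, threshold=10.0):
    """Return indices of frames at which the content is judged to have changed.

    Assumption: the first frame is always marked so that the first slide is kept.
    """
    marks = [0] if frames else []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            marks.append(i)
    return marks


slide_a = [[0, 0], [0, 0]]
slide_b = [[200, 200], [200, 200]]
marks = auto_mark_indices([slide_a, slide_a, slide_b, slide_b, slide_a])
```

A raw pixel difference also reacts to camera shake and lighting changes, so a practical detector would likely compare only the detected document region or its text.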
After the target video and the mark points recorded in it are obtained, the marked video frame images can be determined according to the mark points in the target video. To identify the image content, all marked video frame images can be input into the neural network model, which determines whether the image content of each video frame image contains document materials.
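Determining the video frame image for each mark point can be sketched as a timestamp-to-index conversion. The sketch assumes that a mark point is stored as a time in seconds and that the video has a constant frame rate; neither detail is stated in the application, and the function names are hypothetical.

```python
def frame_index_for_mark(mark_s, fps, frame_count):
    """Map a mark-point time (seconds) to the nearest valid frame index."""
    index = int(round(mark_s * fps))
    # Clamp so that marks at or beyond the end still map to a real frame.
    return max(0, min(index, frame_count - 1))


def frames_for_marks(marks_s, fps, frame_count):
    """Return one frame index per mark point, preserving the mark order."""
    return [frame_index_for_mark(m, fps, frame_count) for m in marks_s]


indices = frames_for_marks([0.0, 1.5, 100.0], fps=30.0, frame_count=300)
```

Each mark point yields exactly one frame index, which matches the statement that each mark point corresponds to one video frame image.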
S120, when the image content of the N images is identified to contain the document materials, the document material images are intercepted from the N images, and M intercepted images are obtained.
Specifically, when the image content of an image is identified as containing document materials, the image may be too dark or overexposed because of the recording environment. Such an image first needs to be adjusted so that its brightness falls within a normal range; this can be done using existing techniques, which are not described here. Optionally, the document materials may include any of PPT documents, courseware documents, and the like.
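The application leaves the brightness adjustment to existing techniques. As one possible illustration, the sketch below applies a uniform gain when the mean brightness of a grayscale image lies outside a chosen range. The range limits, the target value, and the name `normalize_brightness` are all assumptions made for this example.

```python
def normalize_brightness(gray, low=60.0, high=200.0, target=128.0):
    """Scale pixel values so that the mean brightness moves to `target`.

    `gray` is a list of rows of integers in 0..255. Images whose mean already
    lies in [low, high] are returned as an unchanged copy.
    """
    pixels = [p for row in gray for p in row]
    if not pixels:
        return []
    mean = sum(pixels) / len(pixels)
    if low <= mean <= high or mean == 0:
        return [row[:] for row in gray]
    gain = target / mean
    # Clip to the valid 8-bit range after scaling.
    return [[min(255, max(0, round(p * gain))) for p in row] for row in gray]


too_dark = [[10, 20], [30, 40]]
brightened = normalize_brightness(too_dark)
```

A uniform gain is the simplest choice; histogram equalization or gamma correction are common alternatives when contrast also needs to be restored.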
The boundary of the document materials in the processed image is then detected using a boundary recognition technique, and the image is cropped along the detected boundary. It can be understood that, to make the cropped document material image look better, each boundary may be extended outward by a preset length when cropping (that is, the left boundary is extended leftward, the right boundary rightward, the upper boundary upward, and the lower boundary downward). The preset lengths in the four directions may be equal or unequal and can be set according to actual requirements.
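Extending the detected boundary outward by preset lengths can be sketched as below. The box layout `(left, top, right, bottom)`, the name `expand_box`, and the clamping to the image size are assumptions for illustration; the application does not say what happens when the extended boundary would leave the image.

```python
def expand_box(box, image_size, margins=(8, 8, 8, 8)):
    """Extend a crop box outward by per-side margins, clamped to the image.

    box:        (left, top, right, bottom) in pixels
    image_size: (width, height) in pixels
    margins:    (left, top, right, bottom) preset lengths, which may differ
    """
    left, top, right, bottom = box
    width, height = image_size
    m_left, m_top, m_right, m_bottom = margins
    return (
        max(0, left - m_left),
        max(0, top - m_top),
        min(width, right + m_right),
        min(height, bottom + m_bottom),
    )


expanded = expand_box((10, 5, 90, 60), (100, 80), margins=(20, 10, 20, 10))
```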
In one embodiment, in the step of capturing document material images from N images to obtain M captured images:
when identical document material images exist among the document material images intercepted from the N images, taking one of the identical document material images as one intercepted image.
Specifically, the document material images intercepted from the N images may include identical document material images. It may therefore be determined whether the document materials contained in the intercepted images are the same; if so, one of the intercepted images showing the same document materials is retained and the others are discarded.
To judge whether the same document materials exist in the intercepted images, a text comparison algorithm can be used to compare the document materials contained in all the intercepted images.
It follows that, because identical document material images may exist among the intercepted document material images, the number of intercepted images obtained may be smaller than or equal to the number of images. That is, the number M of intercepted images is a positive integer smaller than or equal to the number N of images.
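The duplicate removal described above can be sketched with a text similarity check. The sketch assumes that the text of each intercepted image has already been extracted (for example by OCR, which is not shown) and uses the standard library's `difflib.SequenceMatcher` as a stand-in for the unspecified comparison algorithm. The name `deduplicate_by_text` and the 0.9 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def deduplicate_by_text(page_texts, threshold=0.9):
    """Return indices of intercepted images to keep, dropping near-duplicates.

    The first occurrence of each distinct page is kept; later pages whose
    normalized text is at least `threshold` similar to a kept page are dropped.
    """
    kept = []  # list of (index, normalized_text)
    for index, text in enumerate(page_texts):
        normalized = " ".join(text.split()).lower()
        is_duplicate = any(
            SequenceMatcher(None, normalized, kept_text).ratio() >= threshold
            for _, kept_text in kept
        )
        if not is_duplicate:
            kept.append((index, normalized))
    return [index for index, _ in kept]


texts = [
    "Intro to Image Processing",
    "Intro  to image processing",
    "Perspective correction",
    "Intro to Image Processing",
]
kept_indices = deduplicate_by_text(texts)
```

The number of kept indices plays the role of M and the number of input texts the role of N, so M is at most N by construction.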
When a video is recorded, the camera usually does not face the screen squarely, so the document materials in the recorded video are tilted. Here, tilt means that an included angle exists between any boundary of the document materials and the corresponding boundary of the image in which the document materials are located, and that this angle is larger than an included angle threshold. Therefore, when the document materials are intercepted, tilt detection and correction are needed: the document is first checked for tilt and, if it is tilted, corrected. Commonly used tilt detection methods include text-line-based detection, projection profile analysis, and the Hough transform.
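The tilt definition above can be sketched as an angle test on the four document corners. This is a simplified illustration rather than one of the detection methods listed: it assumes the corners are already known, and it measures each edge against the nearest image axis, which matches the corresponding image boundary only for tilts below 45 degrees. The function names are hypothetical, and the 5 degree default mirrors the example threshold given in this description.

```python
import math


def edge_tilt_deg(p0, p1):
    """Angle in degrees between the edge p0 -> p1 and the nearest image axis."""
    dx = p1[0] - p0[0]
    dy = p1[1] - p0[1]
    angle = abs(math.degrees(math.atan2(dy, dx))) % 90.0
    return min(angle, 90.0 - angle)


def needs_correction(corners, threshold_deg=5.0):
    """True if any document edge is tilted by more than the threshold.

    corners: four (x, y) points in order around the document outline.
    """
    n = len(corners)
    return any(
        edge_tilt_deg(corners[i], corners[(i + 1) % n]) > threshold_deg
        for i in range(n)
    )


upright = [(0, 0), (100, 0), (100, 50), (0, 50)]
tilted = [(0, 0), (100, 20), (95, 70), (-5, 50)]
```

With these inputs, `needs_correction(upright)` is false and `needs_correction(tilted)` is true, because the top edge of the tilted outline rises by about 11.3 degrees.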
In one embodiment, when it is identified that an included angle exists between any boundary of the document materials contained in the image content and the corresponding boundary of the image in which the document materials are located, and the included angle is larger than an included angle threshold, intercepting the document materials from the image includes performing perspective correction cropping on the document materials.
Specifically, performing perspective correction cropping on the document materials means correcting the included angles between all boundaries of the document materials and the corresponding boundaries of the image to within the included angle threshold. This may be done with a Photoshop technique, a distorted document image restoration technique, or another technique, without limitation.
The included angle threshold may be set according to actual requirements; for example, it may be set to 5°.
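Perspective correction is commonly expressed as a homography that maps the four detected document corners onto an upright rectangle. The sketch below computes such a mapping in plain Python; it is one standard formulation, not necessarily the technique intended by the application, and it covers only the coordinate mapping, not the resampling of pixels. All function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return h0..h7 such that each src corner maps to its dst corner.

    u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1)
    v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def apply_homography(h, point):
    """Map one (x, y) point through the homography coefficients h."""
    x, y = point
    w = h[6] * x + h[7] * y + 1.0
    return (
        (h[0] * x + h[1] * y + h[2]) / w,
        (h[3] * x + h[4] * y + h[5]) / w,
    )


# Tilted document corners in the photo, and the upright page they should become.
quad = [(12, 8), (110, 20), (105, 78), (5, 60)]
page = [(0, 0), (100, 0), (100, 60), (0, 60)]
h = homography(quad, page)
```

In practice an image library would usually be used for both steps; OpenCV, for instance, provides `cv2.getPerspectiveTransform` and `cv2.warpPerspective` for this purpose.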
S130, splicing the intercepted images, and outputting the spliced file in an electronic document format.
Specifically, an intercepted image is a document material image cropped from an image. To splice the intercepted images, they may first be spliced into a spliced image, which is then output as a spliced file in an electronic document format. Alternatively, the intercepted images may be inserted into a Word document, a PPT document, a PDF document, or the like, where they are either spliced together or each placed on its own page, and the result is then output as the spliced file in the electronic document format. After the spliced file is output, the path where it is stored may be sent to the user, and the stored file can be found in the file manager.
The electronic document format may be set according to the user's actual needs. Optionally, the electronic document format may include any one of a presentation file format, a PDF (Portable Document Format) format, a Rich Text Format (RTF) format, a Word format, and a text editing system document (Word Processing System, WPS) format. Other formats, such as an Excel workbook format, a web page format, or an MHT file format, may also be used.
It will be appreciated that a lecturer often returns to a page of the PPT or courseware that was already shown earlier. In that case, the video frame image corresponding to a mark point, or a photo taken by the listener, may contain the same content as the video frame image corresponding to an earlier mark point or an earlier photo. If all the intercepted images were spliced directly, the page order of the spliced result might not match the page order of the original PPT or courseware, and the result would contain repeated content. Therefore, the intercepted images need to be sorted when they are spliced.
In one embodiment, splicing the intercepted images includes:
acquiring the playing time sequence, in the target video, of the document material images corresponding to the M intercepted images;
determining a first splicing sequence of the M intercepted images according to the playing time sequence;
and splicing the M intercepted images according to the first splicing sequence.
Specifically, the playing time sequence of the document material images in the target video is related to the times at which the mark points were marked in the target video: an image whose mark point was marked earlier comes earlier in the playing time sequence, and an image whose mark point was marked later comes later. In other words, the playing time sequence of the document material images in the target video is the chronological order in which the mark points were marked.
The first splicing sequence is the display order of the intercepted images in the output spliced file. It is consistent with the playing time sequence, that is, with the chronological order in which the mark points were marked. The M intercepted images are spliced according to the first splicing sequence.
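The first splicing sequence amounts to sorting the intercepted images by the time of their mark points. A minimal sketch, assuming each intercepted image carries its mark-point time in seconds (the name `first_splicing_order` is hypothetical):

```python
def first_splicing_order(mark_times_s):
    """Return image indices ordered by ascending mark-point time.

    Python's sort is stable, so images with equal times keep their input order.
    """
    return sorted(range(len(mark_times_s)), key=lambda i: mark_times_s[i])


order_by_time = first_splicing_order([42.0, 3.5, 17.0])
```

Here the image marked at 3.5 s comes first, followed by the ones marked at 17.0 s and 42.0 s.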
In one embodiment, splicing the intercepted images includes:
determining the document page numbers of the document material images corresponding to the M intercepted images;
determining a second splicing sequence of the M intercepted images according to the document page numbers;
and splicing the M intercepted images according to the second splicing sequence.
Specifically, the page number in document materials such as PPT or courseware is generally placed at the top of the page or at the left, middle, or right of the bottom of the page. The positions where a page number may appear in each intercepted image are examined to determine the page number of the document material image, and the document page numbers of the M intercepted images are determined accordingly.
The second splicing sequence is the display order of the intercepted images in the output spliced file. It is consistent with the page order of the document materials such as PPT or courseware. The M intercepted images are spliced according to the second splicing sequence.
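The second splicing sequence can be sketched as a sort by detected page number. The page-number detection itself is not shown. Placing images without a detected page number at the end, in their original order, is an assumption of this example, since the application does not say how such images are handled.

```python
def second_splicing_order(page_numbers):
    """Return image indices ordered by document page number.

    page_numbers: one entry per intercepted image, either an int or None when
    no page number could be detected (None entries are appended at the end).
    """
    numbered = [i for i, p in enumerate(page_numbers) if p is not None]
    unnumbered = [i for i, p in enumerate(page_numbers) if p is None]
    numbered.sort(key=lambda i: page_numbers[i])
    return numbered + unnumbered


order_by_page = second_splicing_order([3, None, 1, 2])
```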
In the embodiment of the application, when the image content of the N images contains the document data, the document data images are intercepted from the N images to obtain M intercepted images, the M intercepted images are spliced, and the spliced file is output in an electronic document format, so that the time for a user to sort documents such as PPT or courseware can be reduced, and the efficiency is improved.
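Steps S110 to S130 can be summarized as a small pipeline. The sketch below is only a structural outline under stated assumptions: the recognizer, the cropper, the duplicate test, and the sort key are passed in as caller-supplied functions (all hypothetical), and writing the electronic document file is left out.

```python
def process_images(images, contains_document, crop_document, same_document, sort_key):
    """Return the ordered list of M intercepted images for N input images.

    contains_document(image) -> bool   stands in for the recognition step (S110)
    crop_document(image)     -> crop   stands in for the interception step (S120)
    same_document(a, b)      -> bool   duplicate test between two crops
    sort_key(crop)           -> key    play time or page number, used for ordering
    """
    crops = []
    for image in images:
        if not contains_document(image):
            continue  # image has no document materials
        crop = crop_document(image)
        if any(same_document(crop, kept) for kept in crops):
            continue  # keep only one copy of identical document materials
        crops.append(crop)
    # The ordered crops are what would be spliced and written out in S130.
    return sorted(crops, key=sort_key)


# Toy stand-ins: strings play the role of images.
inputs = ["slide:2:B", "audience", "slide:1:A", "slide:2:B"]
pages = process_images(
    inputs,
    contains_document=lambda s: s.startswith("slide:"),
    crop_document=lambda s: s.split(":", 1)[1],
    same_document=lambda a, b: a == b,
    sort_key=lambda c: int(c.split(":")[0]),
)
```

With the toy inputs, N is 4 and M is 2: the audience shot is skipped, the repeated slide is dropped, and the remaining crops are ordered by page number.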
The following describes an image processing method according to the embodiment of the present application by taking recording tag (marker) video as an example.
After recording is finished, the user opens the tag video album on the mobile phone, where a view-tag entry is displayed. Clicking the view-tag entry expands it to show the time of the video frame image corresponding to each tag point. The video frame image corresponding to each tag point is identified to determine whether it contains document materials. When document materials are identified in the video frame images, a file export button is displayed on the album interface of the mobile phone. Clicking the file export button causes the document materials in the video to be intercepted and spliced: perspective correction cropping is performed on pages that need correction, the order of the document materials is determined based on information such as the page number and time of the video frame images, and, if identical document materials exist, duplicates are removed. The document materials in the video are then exported and saved as a PDF file, and the user is shown the storage path of the file, which can be found in the file manager.
Fig. 2 is a schematic structural diagram of an image processing apparatus 200 according to an embodiment of the present application. As shown in fig. 2, the apparatus may implement the method shown in fig. 1, and the apparatus may include:
an identification module 210 for identifying image contents of the N images;
a capturing module 220, configured to, when it is identified that the image content of the N images includes document data, capture a document data image from the N images, and obtain M captured images;
And the output module 230 is used for splicing the M intercepted images and outputting the spliced file in an electronic document format.
Optionally, the image is a video frame image, and the apparatus further includes:
the first acquisition module is used for acquiring the target video and mark points recorded in the target video;
And the determining module is used for determining the video frame image corresponding to the mark point in the target video according to the mark point.
Optionally, the apparatus further comprises:
the input receiving module is used for receiving the mark input on the target video in the recording process or the playing process of the target video;
the response module is used for responding to the marking input and marking the marking points in the corresponding video frame images in the target video;
Wherein each marking point corresponds to a video frame image.
Optionally, the output module 230 is further configured to:
acquire the playing time sequence, in the target video, of the document material images corresponding to the M intercepted images;
determine a first splicing sequence of the M intercepted images according to the playing time sequence;
and splice the M intercepted images according to the first splicing sequence.
Optionally, the output module 230 is further configured to:
determine the document page numbers of the document material images corresponding to the M intercepted images;
determine a second splicing sequence of the M intercepted images according to the document page numbers;
and splice the M intercepted images according to the second splicing sequence.
Optionally, the interception module 220 is further configured to:
when identical document material images exist among the document material images intercepted from the N images, take one of the identical document material images as one intercepted image.
Optionally, when it is identified that an included angle exists between any boundary of the document materials contained in the image content and the corresponding boundary of the image in which the document materials are located, and the included angle is greater than the included angle threshold, the interception module 220 is further configured to:
perform perspective correction cropping on the document materials.
Optionally, the electronic document format includes any one of a presentation file format, a PDF format, a rich text format, a word format, and a text editing system document format.
Optionally, the document material includes any one of PPT document and courseware document.
The image processing device provided in this embodiment may execute the embodiment of the method, and its implementation principle and technical effects are similar, and will not be described herein.
Fig. 3 is a schematic structural diagram of an electronic device 300 suitable for implementing embodiments of the present application.
As shown in fig. 3, the electronic device 300 includes a Central Processing Unit (CPU) 301 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage section 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data required for the operation of the device 300 are also stored. The CPU 301, ROM 302, and RAM 303 are connected to each other through a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
The following components are connected to the I/O interface 305: an input section 306 including a keyboard, a mouse, and the like; an output section 307 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 308 including a hard disk and the like; and a communication section 309 including a network interface card such as a LAN card, a modem, and the like. The communication section 309 performs communication processing via a network such as the internet. A drive 310 is also connected to the I/O interface 305 as needed. A removable medium 311 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 310 as needed, so that a computer program read from it is installed into the storage section 308 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to fig. 1 may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the above-described image processing method. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 309, and/or installed from the removable medium 311.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules involved in the embodiments of the present application may be implemented in software or in hardware. The described units or modules may also be provided in a processor. In some cases, the names of these units or modules do not constitute a limitation of the units or modules themselves.
In another aspect, the present application also provides a storage medium, which may be a storage medium included in the foregoing apparatus in the foregoing embodiment, or may be a storage medium that exists alone and is not assembled into a device. The storage medium stores one or more programs for use by one or more processors in performing the image processing methods described in the present application.
The above description is only illustrative of the preferred embodiments of the present application and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the application is not limited to the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of the technical features described above or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions that are disclosed in (but not limited to) the present application.

Claims (10)

1. An image processing method, comprising:
identifying the image content of N images, wherein the images are at least one of video frame images or picture images;
when the image content of the N images is identified as containing document material, intercepting document material images from the N images to obtain M intercepted images;
splicing the M intercepted images, and outputting the spliced file in an electronic document format;
wherein N is a positive integer, and M is a positive integer less than or equal to N;
when the images are video frame images, the method further comprises, before the identifying the image content of the N images:
determining, according to mark points in a target video, the video frame images corresponding to the mark points, wherein the target video is a video in which mark points are recorded, such as a video recorded by a user, a stored video, or a downloaded video, and each mark point corresponds to one video frame image;
before the mark points recorded in the target video are obtained, the method further comprises:
receiving a mark input on the target video during recording or playing of the target video;
marking, in response to the mark input, a mark point in the corresponding video frame image in the target video;
wherein the generation process of the mark input comprises:
acquiring the change in the document material contained in the image content of adjacent video frame images, and, if the document material contained in the image content of the adjacent video frame images has changed, generating the mark input either by automatic marking or by generating a pop-up window that queries the user.
2. The method of claim 1, wherein splicing the M intercepted images comprises:
acquiring the playing time sequence, in the target video, of the document material images corresponding to the M intercepted images;
determining a first splicing sequence of the M intercepted images according to the playing time sequence;
and splicing the M intercepted images according to the first splicing sequence.
3. The method of claim 1, wherein splicing the M intercepted images comprises:
determining the document page numbers of the document material images corresponding to the M intercepted images;
determining a second splicing sequence of the M intercepted images according to the document page numbers;
and splicing the M intercepted images according to the second splicing sequence.
4. The method of claim 1, wherein intercepting document material images from the N images to obtain M intercepted images comprises:
where the document material images intercepted from the N images include identical document material images, taking one of the identical document material images as one intercepted image.
5. The method according to claim 1, wherein, when it is identified that an included angle exists between any boundary of the document material contained in the image content and the corresponding boundary of the image in which the document material is located, and the included angle is larger than an included-angle threshold,
the intercepting document material images from the N images comprises:
performing perspective-correction cropping on the document material.
6. The method of claim 1, wherein the electronic document format comprises any one of a presentation file format, a PDF format, a rich text format, a word format, and a text editing system document format.
7. The method of claim 1, wherein the document material includes any one of a PPT document and a courseware document.
8. An image processing apparatus, comprising:
an image recognition module, configured to identify the image content of N images, wherein the images are at least one of video frame images or picture images;
an intercepting module, configured to intercept document material images from the N images to obtain M intercepted images when the image content of the N images is identified as containing document material;
an output module, configured to splice the M intercepted images and output the spliced file in an electronic document format;
when the images are video frame images, the apparatus is further configured to, before the identifying the image content of the N images:
determine, according to mark points in a target video, the video frame images corresponding to the mark points, wherein the target video is a video in which mark points are recorded, such as a video recorded by a user, a stored video, or a downloaded video, and each mark point corresponds to one video frame image;
before the mark points recorded in the target video are obtained, the method further comprises:
receiving a mark input on the target video during recording or playing of the target video;
marking, in response to the mark input, a mark point in the corresponding video frame image in the target video;
wherein the generation process of the mark input comprises:
acquiring the change in the document material contained in the image content of adjacent video frame images, and, if the document material contained in the image content of the adjacent video frame images has changed, generating the mark input either by automatic marking or by generating a pop-up window that queries the user.
9. An apparatus comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the image processing method of any one of claims 1-7 when executing the program.
10. A readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the image processing method according to any one of claims 1-7.
CN202011065951.9A 2020-09-30 2020-09-30 Image processing method, device, equipment and storage medium Active CN114359920B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011065951.9A CN114359920B (en) 2020-09-30 2020-09-30 Image processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011065951.9A CN114359920B (en) 2020-09-30 2020-09-30 Image processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114359920A CN114359920A (en) 2022-04-15
CN114359920B true CN114359920B (en) 2025-07-29

Family

ID=81090172

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011065951.9A Active CN114359920B (en) 2020-09-30 2020-09-30 Image processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114359920B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117749966B (en) * 2022-09-13 2025-07-25 荣耀终端股份有限公司 Document processing method and electronic device
CN115270019A (en) * 2022-09-23 2022-11-01 南方电网数字电网研究院有限公司 Method and system for localizing webpage version electronic document

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110381382A (en) * 2019-07-23 2019-10-25 腾讯科技(深圳)有限公司 Video takes down notes generation method, device, storage medium and computer equipment
CN110414352A (en) * 2019-06-26 2019-11-05 深圳市容会科技有限公司 Method and related equipment for extracting PPT file information from video file

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6760463B2 (en) * 1995-05-08 2004-07-06 Digimarc Corporation Watermarking methods and media
US8467660B2 (en) * 2011-08-23 2013-06-18 Ash K. Gilpin Video tagging system
CN106131627B (en) * 2016-07-07 2019-03-26 腾讯科技(深圳)有限公司 A kind of method for processing video frequency, apparatus and system
KR20190024182A (en) * 2017-08-31 2019-03-08 주식회사 코아비즈 Method for recognizing image of multi document
US10733469B2 (en) * 2017-12-29 2020-08-04 Idemia Identity & Security USA LLC Capturing digital images of documents
US10735793B1 (en) * 2019-03-15 2020-08-04 Adobe Inc. Recording and playing back image variations
CN110427819B (en) * 2019-06-26 2022-11-29 深圳职业技术学院 A method and related equipment for identifying PPT borders in images
CN110490101A (en) * 2019-07-30 2019-11-22 平安科技(深圳)有限公司 A kind of picture intercept method, device and computer storage medium
CN110493640A (en) * 2019-08-01 2019-11-22 东莞理工学院 A kind of system and method that the Video Quality Metric based on video processing is PPT

Also Published As

Publication number Publication date
CN114359920A (en) 2022-04-15

Similar Documents

Publication Publication Date Title
WO2022089170A1 (en) Caption area identification method and apparatus, and device and storage medium
WO2021035223A1 (en) Automatic data extraction and conversion of video/images/sound information from a board-presented lecture into an editable notetaking resource
US20140164927A1 (en) Talk Tags
CN105808782A (en) Picture label adding method and device
CN112306601B (en) Application interaction method, device, electronic device and storage medium
KR102292775B1 (en) System and method for providing learning service
CN110085068A (en) Learning tutoring method and device based on image recognition
CN111027537B (en) A question searching method and electronic device
WO2019033656A1 (en) Board-writing processing method, device and apparatus, and computer-readable storage medium
US20210012511A1 (en) Visual search method, computer device, and storage medium
CN113840099B (en) Video processing method, device, equipment and computer readable storage medium
CN114359920B (en) Image processing method, device, equipment and storage medium
CN111723653B (en) Method and device for reading drawing book based on artificial intelligence
CN111881904A (en) Blackboard writing recording method and system
CN111079777B (en) Page positioning-based click-to-read method and electronic equipment
CN111753715A (en) Method and device for shooting test questions in click-to-read scene, electronic equipment and storage medium
CN118524240B (en) Streaming media file generation method, terminal and storage medium
WO2023272656A1 (en) Picture book recognition method and apparatus, family education machine, and storage medium
CN114005121A (en) A text recognition method and device for a mobile terminal
CN112270295A (en) Method and device, terminal device and storage medium for framing questions in student homework scenario
CN111081088A (en) Dictation word receiving and recording method and electronic equipment
CN111464865B (en) Video generation method and device, electronic equipment and computer readable storage medium
CN111582281B (en) Picture display optimization method and device, electronic equipment and storage medium
KR102192558B1 (en) System and method for managing lecture for sharing taking notes
CN116434253A (en) Image processing method, device, equipment, storage medium and product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant